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Abstract 


This thesis deals with challenging problems related to the interpretation and 
quantification of medical images. We develop and apply novel statistical models of shape 
and appearance to improve the support of various medical practices. We are particularly 
concerned with two spine-related tasks: the segmentation of vertebral structures in x-ray 
images, and the analysis of vertebral deformities. These tasks are relevant for various 
clinical activities such as the diagnosis of spine-related diseases, surgical planning, and 
studies of disease progression. Within this context, this thesis makes contributions in the 
following areas: 

First, in an attempt to overcome the challenges associated with the modeling of high- 
dimensional shape space, we propose a novel multi-scale statistical shape model. The 
new model uses concepts of sparsity, best basis selection and Independent Component 
Analysis (ICA) to extract multi-scale modes of deformations. This method allows for the 
construction of a localized non-linear shape model with clinically-relevant modes of 
deformation. 

Second, we present a novel framework for shape analysis and classification of vertebral 
structures by using pathology-related Wavelet-ICA shape parameters. Within a statistical 
framework, we are able to establish a statistically relevant relationship between the 
wavelet-ICA modes of deformations and clinical information. The framework is then 
extended to achieve multi-vertebrae classification by using Wavelet-ICA parameters. 
Thanks to the locality of the features, the new classification scheme can assess the 
condition of several vertebrae simultaneously. Our results show that significant 
differences are found between groups of healthy and pathological vertebrae. 

Third, we develop a novel method for vertebral segmentation by using localized 


appearance models and a multi-scale shape prior. The new segmentation algorithm 
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incorporates the following: (i) a method for salient point detection in the non sub-sampled 
Contourlet domain, (11) contoneeh a local appearance profiles, and (i11) a multi-scale 
shape prior to drive the segmentation process. We evaluate the ability of the proposed 
method to accurately segment normal and pathological vertebral structures. 

Finally, we tackle the challenges related to the accurate modeling of the texture and 
scarcity of training data. We present a new sparse texture appearance model based upon 
the Contourlet Transform and Independent Component Analysis (ICA). The new model 
benefits from the non-linear approximation power of Contourlets to achieve high 
reduction rates. ICA is then employed to capture localized texture variations. We also 
describe a general framework that integrates our developed shape and texture models in a 


unified framework for synthesis of photo-realistic pathological x-ray images. 


bovlepreta ine aon ‘onl ani 1 ira ow sles volt vs 
siete fi) bine chia somes tio ton 
bemcy wi Yo wrileca oly siaitevo aw “oti 
oy Ro 1 dy ao hinigdlo sting haw’ : 

as como? oft te gablubenn ofprion: sa ci} (Gown boy 


nore fiodacl Lobo rote h SUiIASE ne Wir 2 ot ( 


betwen wer on} A) 28 a Meade bie wits esa 
Gelso Ve vareeientiey Saint) Goxrisoog Sitgip er i We rn rill ai 48 
6 ni zleborm owes) (oe onthe hye agolg sab) 19 denariini on Joo Hyatt 


Ree YEA avigglomuig at herltatva heeds ashe 0 


'‘ r 
+, 


oe 


Acknowledgments 


I am first and above all eternally thankful to God Almighty, for it was with his will and 
guidance that I have completed this work, and he has bestowed infinite other blessings on 
me throughout my life. I also wish to express, my deep thanks and gratitude to the 
following people and institutions. 


My supervisor, Dr. Nelson Durdle, for his guidance and support during the course of my 
PhD program. Also, I am grateful to Dr. Hong Zhao for serving as a co-supervisor for this 
thesis. 


All my professors, for the major roles they played in my overall education. I would like to 
especially thank Dr. Dil Joseph and Dr. Mrinal Mandal from the U. of Alberta. Also, I am 
greatly indebted to Dr. Magdy Saeb for the role he played during my undergraduate and 
Master study. 


My collegues at the Medical Imaging Lab - Ahmed El]Safi, Lino Rameriz, and Peter 
Ajemba for their support and encouragement. 


All the wonderful people and friends whom I was honoured to meet over the past five 
years. 


The US National Library of Medicine for providing me with the spine x-ray images 
dataset. Also, thanks goes to Dr. Thomas Deserno at Aachen University of Technology, 
Germany, for providing me with the database for the medical image retrieval 
experiments. 


The Informatics Circle of Research Excellence (iCORE), and the Killam Trusts for their 
financial support of this work. 


All the administrative staff in the Electrical and Computer Engineering department at the 
U. of Alberta. 


Last, but not least, I am deeply indebted and thankful to my family for its continuous 
support and encouragement. 


5 a a 
(en ‘14 WV 
ro waits 


cua ai vel vide yee oy Dvloorr en badelind yigoohh 


wet aay 


sie. oo teL 0, ive ota ibs sealed wi 
J i| tt BAAR. Jey } 0m a 46. RTO, mustiod! WE OD 


aysiu aol ee don ia yeen aro 


ithe Ge eae grab VETS HOTELS Gt at “A 


Ry. 
0 oehiol Tyas be sonalioa sil yor ae 
uD 8 Ak GAP te Pons — 


“a 


1 wees hye on f y Meus Mi iy bowph ’ yf it alg 
JEM, 040 adhe fo oe! ‘eh ie pais 


cians On ED aA ee ish alin ai wwe 
1 yguaigaing Bite siege 


Sho taan 1 Gees: aro Pb phd » theese hla " 


ch Wt 2 


elie le Ane, ae A ier rr 7) ithe sill a Lees 


Table of Contents 


BN NEG En CEN, pais sscyacoppnkinrciceet seek Gacesneamnassneuacsasdnasi sindssnedwuesseipakacesse>aanononesoas 1 
WTR ORO. Fxi. sreetatit Sree, GRR OS ose ache aca ca ccateateadseasachvcaatorsasnecitarseeistasiaansieeecd | 
NETS Sy OTIS T 2 RT 2 Ser co 2 
PWR) Mas CULES LMI Wa LULU C31 RSS BF 19 (ea 0 a OC ve 3 
122 Commomyetiobral patholo c yirers 718 1506... cies concesececeneccocercesuesentaavvanconeserevereel 3 
Men NEMO RNAS OTTO ENHOUS 2.0505.) 52 cee .c<1tgnesssseicdere cvvexnceeeiciv ats vecacansaenavesacienieensacarsveced 6 
Be MOS ete Pie R eM a asa dude (edu nadauanetcccidan Wake pesacwaiiad 6 
Be. 2 Comtrti ions ee tee sete eee eee rede apse esavssnuctsvnsauaaeapiacactedesseeeesauaay’ 8 
EGE OSC ORG ON COST. SR RASA [oe AREAL eat eS 1] 
2. BACKGROUND AND RETA BED WORK roviscccccccecccicceleteosecvsescccsossseencess 12 
iy NERO LU) 61) | ten ep ee en re op 2-0 hu. ne = Poh da Sa Ant: See ION PZ 
Bee SNAP LE PLC Se MEAULONN nao ete oc cnnctcopea strate eer coaes eens et Mmera tc deena ott tear ects Ser exer rccntaceeeel eee 13 
Dens bat Ss Pime Based TepreSelitt Ont eres nee eee Son) inc, paar. en Me enp des vacedesekbedss 14 

PON 2 Pouries Deschiptors. Aa PES INC. SS EE eae. 15 
20.2 hondmarkehasedtepresentanon et. ye Raven Ree... de A WEN... oes. 15 
meni) Medial axis—paScdirepresc iit On ati ern tee tS ants eine ssa udece wardens 16 

ei de MAD ICIt LEVel-Set Genie emt AION cara eas eee oes cok ca cou stas Geese evens LF 

2-5 Deformatic Models tc wu chaeore an at cn ees lees hee noc ces ia secedevneseadneedensbes 17 
UAT ACHIVE COUtOUuts (ON aKCS iso ore, aes ener entre, Wee TE eee ty 18 
NSD. FI VEMOR IMEI IS 5 Ss ete cae 09) oh icity ere ener so oe ee i Ee Soda 20 
2.3.5 Statistical Shape sMOdels x22) eee se yee eee ne eA a oan Ser wnorky saat 21 

2 AB Wea ety Odio ye tet 5 a. ke Soa eee ee see ed cocoa ad ets tans 2) 
2-4 “oi StriDuliom NIGUE! She. setae ve dane eee ee eerie eed doer a gery 22 
2A 2 Rowan pearance oa yao c cacao vest MSc cats vessigs cases cessohas sesh duwen eden suaue-apsekssey: 23 
ZG ANGOGE SEAL hn PIOCECIANE. 0h orn Cen varrcta ress ose rteescoccaee ees eek make hare a vas Seats 24 
2, SEACt Vey ping ata eA NOG Mee teat tht 65 0205 ss 2as da fev pgs secantes 2: paarbucyege Seedsdyucoaasns ase: 25 
2 RST ACN Wee Moree 8h cs, 5 Ser ea 2 haf ohne cen ecois on day sue aes aes ckes 26 

2, SHOR Tie RU ele lie e een ee ce, sel 2 Oe bade, Praga Gai beer tet oo. cpa vacdk soo ee, -ndoutg yeicenseyeeee 26 
233 Uma cer sete ih a eae erected io ccna ceo csecenetsene an 5aeésachssvasecpeeeteuedss ees ookeca espe: 28 
2.6 Related s Wonk sekeds ered Pea: bi iiven deetore wnhit-Ceromeseea Ames BM AY. cob oes 28 
ZAI OLARESEIC AIG 5 IAC AV MOCO PRINS see ho oh 9k hans) San em ne ees atest 29 
4G Statistica) Appearance WOE INR ose cosa ob cn vnceby saps clpeemiennee awesne i tu ceisancsteens 30 
2.6.5 Model-based segmentation Methods. 25.2.2. o...20. e005 oc ccenssvatsdecyedeceeecesceunessecstsutes 32 


2.64 Seememationand analysis of vettebral structures .....-..:.....-:2...j2e0so00--esstenctans: 33 


ge! 


ap <7 44 


OF... 


, hn 


he kim ies ps 


tE. ere eer ey ) te 


iene ade 


boi its 


Lwe 0<404e08 phaobteeniere sear’ evle ie Te dis 
¥ 
‘ . ’ . . « sh phere dae ts 
” 4 eeeee ’ oer? 'e > 7=<7e7 
yo. 


BL Ye ; BY a 
=p tae ts ibe “pee aL per Writ 
kes A j ey Ae at vifecrmes presi e 
ree ee ee wo Bey DR gay te eee PIPE ard dt 


teeecete Ki mie inde Teg eW IAL On dawd de? {did ee 


, ae ; ai 
Per i, en ee (ida pe nenp cere Reap hoon aie Mae FRR aie « Mss so 4 . 
4 , i Mi a» a ey . +. 
: ya ® 1 pus ‘ a i a ; 


i j y ft -aee a a 
PP TITe TTI ta ee er 
fetes hi r) 
ie trp hrdee en i Ahnp evade ieee ment ibtnvsn hay a Amehs oe of aca - 


<2 ° + Vase yar h ks Se amie ib WerOee et DP eeee 


enn te pee . . 
Y 4 | > Liyiveer 
t q es 
‘ ‘ es steer oes 


so ro neh AACS ags yh ly soho 4s (& Rem’ 


5 ee > = 6 ae eee La ay tee eh eae 
h a. 


hh a ’ q 
men ey ee 
an Leh, eee vet bien OMe 


‘ ibs Viswaaell weir veil emal 
a se i 7 


vee pik 
ih : aa t 
%; 


1 4 ja i.) | ’ jie ha pry 


= | 


oa° 


Weerunerreaetiry arts tere re edo ear Cg 


eb 440d be Cai keen i eee, vides ee wy yh ats wh el ata 


aS ; a Fae 


otaeed Stas caseload 7) ‘wuabnecredaanah fides pens tata SO 
hi i i a ah 
i 4 J 


\ 
Piles i t vr 
Vind arsbes ecm rs tant baad ee aie ddd wa 


+ He 4 Sain) [aE Vea dT peed eee eyriehe ree nas 
} * iy, 
Cem rR een hme Wiehe + ten pend tie 


poet atareh vi Ay i) i tea doa Fy 


o aia 
Peeks ve AE 
Pai f mine t 


} 


3. MULTI-SCALE MEDICAL IMAGE ANALYSIS TOOLS: WAVELETS 
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1. Introduction 


1.1 Motivation 


Medical imaging is usually used to refer to a set of non-invasive techniques for probing 
the human body. The field of medical imaging dates back to the pioneer work of 
Professor Wilhem Ronteger and his seminal paper “On a new kind of Radiation.” Due to 
recent advances in imaging technologies, the scientific community has been challenged 
by the need to develop efficient, accurate and reproducible computer-aided methods to 
support medical experts in diagnostic and decision-making practices. The applications 
include early disease diagnosis, surgical planning, pathology-based image retrieval, and 


disease propagation and prediction. 


A growing demand exists within the medical imaging community to develop computer- 
based methods that can achieve some sort of image interpretation and infer relevant 
quantitative measures. Image interpretation, always the ultimate goal of research in 
computer vision, includes the integration of a number of low-level and high-level tasks 


such as object detection, segmentation, recognition, registration and inferring. 


In an attempt to achieve this long-standing goal, it seems logical to try to mimic the 
human visual system, which is the most amazing known image interpretation and analysis 
system. Researchers in cognitive studies believe that human vision is an active process. 
Our vision of the world is constructed from a combination of the retinal image and the 
prior knowledge we have about a scene. Traditionally, computer scientists have viewed 
the vision process as cascaded blocks of tasks such as registration, segmentation, and 
recognition. However, recent research efforts have indicated that this perception appears 
to be an over-simplification of the vision process. It is now believed that our visual 
system performs all these tasks in a simultaneous fashion that benefits from both low- 
level cues and high-level prior knowledge. Within the medical imaging community, this 
belief has increased the importance of prior knowledge-based models in various medical 


image analysis tasks. 


This thesis is motivated mainly by two medical imaging tasks: the segmentation and 


analysis of non-rigid anatomical structures. We aim at developing model-based methods 
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for detecting deformations in medical images, and establishing relationship between these 
deformations and the various possible pathologies. Within a medical context, 
segmentation is the process of detecting anatomical structures in a complex image scene. 
Unlike man-made objects, anatomical structures are complex and subject to a large 
amount of variability. Statistical shape analysis is the process of locating changes 
between healthy and pathological structures. In this thesis, we are particularly interested 
in developing model-based methods to support practices related to the imaging of the 
human spine. Medical experts often examine hundreds of spine x-rays to look for various 
pathologies. Common pathologies include anterior osteophytes, disc space narrowing, 
and fractures. Detecting these pathologies normally requires the radiologist to perform the 
following tasks: locate the target structure in the image, accurately outline its edges, and 
employ prior knowledge to extract cues about the existence of a particular pathology. 


This process is both error-prone and time-consuming. 


Thus, the work presented in this thesis is motivated by the need to develop new model- 
based methods to support various diagnostic practices conducted by medical experts. 


Inspired by studies about our human visual system, we aim to achieve the following 


goals: 
- Developing novel statistical models that accurately capture prior knowledge 
about the shape and texture appearance of anatomical structures. 
- Developing model-based methods for two spine-related tasks: the segmentation, 
and analysis of vertebral structures in medical images. 
- Using the developed models of shape and texture to synthesize medical images 
with pathology-related deformations. 
1.2 Medical Context 


Throughout this work, we are interested in developing efficient statistical models of 
shape and appearance. We apply these models for the segmentation and analysis of 
vertebral structure in x-ray images to extract clinically relevant information. This section 


presents a brief overview of the medical context of the work presented in this thesis. 
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1.2.1 Anatomy of human spine 
The human spine is made up of a number of vertebral bodies and joints referred to as 


inter-vertebral disks. Figure 1.1 shows the anatomy of the human spine. The spinal 
column is made up of a total of 33 vertebrae, separated by 23 inter-vertebral disks. It is 
divided into several regions: the cervical, thoracic, lumbar, sacrum, and coccyx spine. 
The vertebra is a box-like bony structure that protects the spinal cord. The inter-vertebral 
disk acts as a shock absorber between adjacent vertebrae. It helps minimize the 
movement between adjacent vertebral bodies, while allowing for more flexibility for the 


whole spine [1]. 


Vertebral Body Transverse Process 


Sacnmm spine 


Coccyx spine 


{a) 


Figure 1.1 Anatomy of spine. (a) Lateral view of spine segments: Cervical, Thoracic, Lumbar, Sacrum, 
and Coccyx, (b) Lateral anatomy of a vertebra [1] 


1.2.2 Common vertebral pathology 

Medical experts often examine hundreds of spine x-ray images to look for various 
pathologies. Common pathologies of interests include osteophytes, disc space narrowing, 
and vertebral fractures. Figure 1.2 shows examples of x-ray images with different 


pathologies. 
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Figure 1.2 Lateral Radiographs for human spine with various pathologies (a ) Lumbar spine with mild 
fracture in L3, (b ) Lumbar spine with wedge fracture in L3, and crush in L2, (c ) spine vertebrae 
with various fractures( e ) Lumbar spine with osteophytes in L1,L2,L3, ( f ) Zoom-in showing disc 
space narrowing example [108] 


Osteophyte pathology refers to bony growths on vertebral corners. Figure 1.2 (d) shows 
an example of the anterior osteophyte on lumbar vertebrae. Disc space narrowing, another 
common pathology in spine x-rays, refers to the change in inter-vertebral discs 
(degeneration). Numerous factors can contribute to this pathology, including age, gender, 
and injury. Disc space narrowing can be accompanied by back pain and nerve root 
compressions. Figure 1.2 (e) shows an example of vertebral structures with disc space 
narrowing pathology. Both osteophytes and disc space narrowing are often used as 
evidence of osteoarthritis disease. Osteoarthritis (degenerative joint disease) refers to the 
deterioration of joints in the body due to age, injury, or other diseases. Statistics show 


over 27 million patients have this disease in the United States [2]. 


Vertebral fracture is another important pathology that occurs due to a weakened vertebral 
body. Fractures are often detected by assessing the inclination of the vertebral endplates. 
Vertebral fracture is a continuous process requiring early diagnosis. A mild vertebral 


fracture is the most difficult to detect and appears as a small concave depression of the 
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endplate. A fracture can then progress to a wedge, which is a reduction in the anterior 
height of the vertebral body. A more severe fracture is the vertebral crush, which causes 
the vertebral body to start to lose its height. Figures 1.2(a) and 1.2(b) show examples of 
x-ray images with different types of fractures. Traditionally, fractures are assessed by 
using vertebral morphometry. A six-point representation of the vertebral body is used to 
conduct measures such as Anterior Height, Middle Height and Posterior Height [3]. In 
[4], Genant et al. proposed a semi-quantitative fracture grading system to assess the type 
and severity of vertebral fracture. Vertebral fractures are often used as evidence for 
Osteoporosis disease, characterized by low bone mass and structure deterioration of bone 
tissue, which result in fragile bones. According to [5], over 2 million of the vertebral 
fractures in the United States are due to Ostereoposis. It affects one in every two women 


and one in every four men over the age of fifty. 
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Figure 1.3 The Genant Semi-quantitative grading system [4] 
Scoliosis is another disease often detected through assessment of vertebral morphometry 
and spine deformity. Scoliosis is a condition that causes lateral displacement of the 
vertebrae, resulting in curvature and rotation of the whole spine. It affects between 2-4 % 
of adolescents [6]. Scoliosis causes inclinations to the whole spine, as well as wedging to 
the individual vertebrae. Scoliosis is usually assessed by using the Cobb and wedge 
angles. The Cobb angle is obtained from the Posterior-Anterior Radiograph of the spine 
[7]. Vertebral and inter-vertebral wedge angles are two other famous indicators of 
Scoliosis [8]. The vertebral wedge angle is defined as the angular difference between the 


lower and upper endplates of the vertebral body. The inter-vertebral wedge angle is the 
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angular difference between the lower endplates of the top vertebra and the upper endplate 


of the bottom vertebra. 


1.3 Challenges and Contributions 
1.3.1 Challenges 
This research deals with two challenging problems in medical imaging: 
he Efficient statistical modeling of the shape and appearance variations of 
anatomical structures. 
He Application of these models in high-level spine-related tasks: the 
segmentation and analysis of vertebral structures. 
Our work focuses on challenges such as those caused by the complexity of the anatomical 
structures, the wide range of variability that can take place, and the scarcity of ground 


truth data. The problems addressed in this thesis are outlined below. 


Shape representation and modeling of anatomical structures 

Statistical shape models try to faithfully represent the full range of biological variations in 
anatomical structures. However, the accurate representation and modeling of anatomical 
structures is a difficult task. 

- Anatomical structures are complex and exhibit a wide range of local and global 
variations at various scales and frequencies. Thus, a rich representation of anatomical 
shape is crucial for the construction of efficient statistical shape models. This requirement 
is particularly true for articulated structures such as the human spine. Medical experts 
often examine hundreds of spine x-rays to determine the existence of various vertebral 
pathologies. Since most pathologies tend to be localized rather than global, the challenge 
is to find a suitable low-dimensional representation that can still capture subtle 
pathology-related information. This kind of representation would allow for the 


construction of better non-uniform shape models. 


- An important application for shape priors is to constrain ill-posed problems 
such as segmentation and registration. Due to the limited availability of training samples 
and the complexity of anatomical shapes, statistical learning in the high-dimensional 
shape space constitutes an important challenge. Most shape priors suffer from either 
being too localized or too global. A localized shape prior uses only smoothness to 


constrain the solution. However, due to the uncertainty about the appearance of structures 
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in medical images, such priors might result in nonplausible solutions. On the other hand, 
an overly-constrained global prior fails to generalize well to cases unseen during the 


training phase. In this thesis, we focus on multi-scale methods to address this problem. 


- We also focus on the clinical relevance of the model. The interpretability of 
model parameters has recently drawn much attention within the medical imaging 
community. This issue involves the ability of the model to extract modes of variations, 


which are physically intuitive and pathology related. 


Statistical shape analysis in medical imaging 

Statistical shape analysis refers to the ability to understand how anatomical structures 
tend to vary due to reasons such as the development of certain pathology, aging, or 
surgery. The medical imaging research community is constantly concerned with 
enhancing its ability to distinguish between the anatomy of healthy individuals and those 
with pathology. This enhancement would allow for the early detection and diagnosis of 
diseases. Moreover, it also opens the door for a wide range of applications such as 
surgical planning and pathology-related information retrieval. The efficient analysis and 
classification of anatomical structures continues to be a formidable task because such 
structures exhibit a wide range of inter-subject variability and structural differences 
among different pathologies. Traditionally, vertebral shape analysis has been performed 
by using a few points that outline vertebral bodies. Currently, shape-based methods are 
increasingly being used for better quantification of vertebral deformities. The challenge is 
to be able to develop robust shape analysis and classification methods with rich 


representations that can capture subtle pathology-related information. 


Segmentation of vertebral structures in x-ray images 

Segmentation of anatomical structures in medical images is a requirement for various 
applications such as computer-aided diagnosis, 3D visualization, and surgical planning. 
Computer-assisted segmentation methods aim at reducing the time and cost of the error- 
prone manual segmentation process. In this work, we are particularly interested in the 
segmentation of human vertebrae in digital radiographs. Accurate vertebrae segmentation 
plays a vital role in the proper assessment of various vertebral abnormalities. The goal is 
to develop efficient methods that can cope with various conditions such as pathology, 


weak edges, and noisy images. Developing such methods continues to be a difficult task 
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because of (i) poor image quality and noisy and incomplete data that affect the ability of 
low-level imaging procedures to accurately extract cues about the boundaries of the 
vertebral body; (i1) the complexity of the anatomical structure of the human spine and the 
wide range of possible variations, which tend to limit the ability to generalize to unseen 
examples; and (111) the scarcity of training data, which also limits the performance of the 


segmentation method. 


Modeling of texture appearance of anatomical structures 

Appearance models attempt to capture the general texture properties of the object of 
interest. They are important for the interpretation of unseen images and the synthesis of 
new images. Traditionally, texture appearance models have been constructed by using a 
holistic Principle Component Analysis (PCA) basis extracted from a raw intensity pixel 
representation. Despite their potential benefits, traditional appearance models face a 
number of limitations. First, the scarcity of the training data tends to limit the model’s 
accuracy due to the high dimensionality of the texture space. Second, modeling raw pixel 
intensities in texture appearance models becomes computationally infeasible in case of 
high-resolution images. This problem has led to a number of research efforts to develop 
enhanced appearance models. The challenges in texture appearance modeling include (1) 
the need to model the texture appearance in a domain that achieves high reduction rates 
while preserving higher-order details in the image such as boundaries and (ii) the fact that 
most appearance models use PCA for statistical modeling. Their use of PCA results in 
global texture modes of variations. However, most pathology is often related to local 


variations in both texture and shape. 


1.3.2 Contributions 
This thesis focuses on developing novel statistical models of shape and appearance to 


improve the support of various medical practices. We are concerned mainly with 
applying these models for the accurate characterization and segmentation of vertebral 
structures in x-ray images. This application is relevant for a wide range of clinical 
activities such as the diagnosis of various spine-related diseases (such as Osteoporosis, 
Osteoarthritis), surgical planning, and population studies of disease progression. Within 
this context, this thesis makes contributions in the following areas: (1) The construction of 
a descriptive multi-scale statistical shape model, (ii) efficient vertebral shape analysis and 
classification schemes, (iii) accurate prior-based segmentation of vertebral structures, and 


(iv) enhanced texture appearance modeling. 
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First, we attempt to tackle some of the challenges associated with modeling high- 
dimensional shape space by using a small training set. We aim at constructing an accurate 
shape model that captures both local and global pathology-related modes of variations by 
using a limited number of training samples. We propose a novel multi-scale statistical 
shape model that uses the concepts of sparsity, dimension reduction, and statistical 
independence to extract sets of multi-scale clinicaliy relevant modes of deformations. 
Within a best basis selection framework, the proposed method benefits from the multi- 
scale and sparsifying nature of wavelet packets and the ability of Independent Component 
Analysis (ICA) to capture higher-order statistics. This method allows for the construction 
of a localized non-linear shape model with clinically-relevant modes of deformation. Our 
results have demonstrated that the new model outperforms the Active Shape Model and 


other related models in the literature, even in the presence of noise. 


Second, we present a novel framework for the analysis and classification of vertebral 
structures by using pathology-related Wavelet-ICA shape parameters. For vertebral shape 
analysis, we use the parameters of the multi-scale shape model. Using statistical learning 
techniques, we are able to establish a statistically relevant relationship between the 
Wavelet-ICA modes of deformations and clinical information. Our results illustrate that 
significant differences were found between groups of healthy and pathological vertebrae. 
This finding allows for the anatomically meaningful interpretation of the multi-scale 
modes of variations. Furthermore, we propose a novel method for multi-vertebrae 
classification using Wavelet-ICA parameters and the Fisher classifier. We demonstrate 
the ability of the proposed framework to localize shape changes in vertebral structures 
related to a specific pathology. Thanks to the locality of the features, the new 
classification scheme can assess the condition of several vertebrae simultaneously. Our 
results show that the proposed framework achieves a better performance than that of 


analysis using the PCA-based representation of shapes. 


Third, in an attempt to bring recent advances in computational harmonic analysis to the 
field of medical imaging, we present a novel vertebral segmentation algorithm using 
Contourlet-based local appearance profiles and a multi-scale shape prior. The new 
segmentation method aims to overcome challenging segmentation problems such as weak 
edges, noisy images, and the wide range of anatomical variations. At the core of the 


segmentation algorithm is a novel Contourlet-based salient point detection and matching 
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method. We employ the Non Subsampled Contourlet Transform (NSCT) to extract 
salient points of the vertebral body and build local appearance profiles for these points. 
Segmentation is then achieved via shape-driven matching of the salient points in the 
Contourlet domain. At a low level, the new salient point detector is used to detect 
evidence of the object of interest in the test image. At a high level, Contourlet-based local 
appearance profiles and the multi-scale shape prior are used to drive the segmentation 
process. Compared to Active Shape Models (ASM) [14] and related work in the 
literature, the proposed segmentation method achieved a high performance in terms of 
robustness to noisy images and generalization to unseen cases. We have also used the 
proposed Contourlet-based salient point detection algorithm in a Content-based medical 
image retrieval task. The proposed method benefits from the robustness and stability of 
the extracted salient points, and the information-rich Contourlet descriptors to retrieve 
images of similar classes. Promising results were obtained when applied to the IRMA 
(Image Retrieval in Medical Applications) database [125]. This work represents the first 
contribution from the Medical group at University of Alberta to the IRMA project. 


Fourth, we tackle challenges related to the accurate modeling of texture and the sacristy 
of training data. We present a new sparse texture appearance model (CTICA-AM) based 
upon Contourlet Transform and Independent Component Analysis (ICA). The new model 
benefits from the non-linear approximation power of Contourlets to achieve high 
reduction rates. The statistical independence of Independent Component Analysis (ICA) 
is employed to capture localized texture variations. The proposed Contourlet-enhanced 
Texture Appearance Model is shown to outperform related appearance models in the 
literature, especially at high reduction rates. We also describe a general framework that 
integrates developed shape and texture models in a unified framework for the synthesis of 


photo-realistic pathological x-ray images. 
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1.4 Thesis Organization 
The rest of the thesis is organized as follows: 


Chapter 2 presents some of the background material essential for understanding the 
methods developed in this thesis. The chapter also provides a brief review of the most 


relevant work in the literature. 


Chapter 3 presents an overview of the recent advances in multi-scale image analysis 
tools. The chapter also discusses the suitability of various multi-scale tools for medical 


image analysis and interpretation. 


Chapter 4 presents the details for a novel multi-scale statistical shape model. 
Experimental results are also provided for the quantitative and qualitative evaluation of 


the new shape model. 


Chapter 5 presents the details for a new vertebral shape analysis and classification 


framework. Experimental results are provided for the evaluation of the proposed method. 


Chapter 6 presents details for a new salient point detection and matching algorithm. The 


chapter also presents novel vertebrae segmentation and retrieval methods. 
Chapter 7 presents a new Contourlet-enhanced appearance model. The chapter also 
presents a general framework for the synthesis of realistic spine x-rays with various 


pathological deformations. 


Chapter 8 summarizes the work presented in this thesis. The chapter also discusses some 


of the remaining challenges and recommendations for future work. 
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2. Background and Related Work 


2.1 Introduction 
Decades of research in computer vision and cognitive studies have led to a gradual shift 


from “bottom-up” to “top-down or “model-based” strategies for image analysis and 
interpretation. In “bottom-up” strategies, images are analyzed by using pure low-level 
procedures by looking for local structures such as edges, contours, and regions. The 
evidence is then grouped together and used to identify objects of interest. While this 
approach might be able to achieve limited success in controlled environments, it has 
proven to be difficult to use and prone to failure in more challenging situations such as 
medical images. On the other hand, “top-down” strategies tend to use prior-knowledge 
about what is expected in the image. Models of prior-knowledge are matched to the data 
in the test image. Model-based vision systems often employ a combination of bottom-up 
and top-down approaches. Over the past decades, model-based vision methods have been 
successively applied to various image analysis and computer vision tasks [9]. Specefity 
and generalizability are the two major challenges in any model-based vision system, 
particularly in medical image analysis, because of the noise in the images and the 


complexity of the anatomical structures of interest. 


The segmentation of anatomical structures is a fundamental step in almost any medical 
image analysis task. Segmentation is the process of outlining the anatomical structure of 
interest for further processing. Unlike man-made objects, anatomical structures are 
complex and subject to a wide range of variability. In an attempt to address these 
challenges, deformable models have been introduced and extensively used for the 
segmentation and analysis of anatomical structures. Deformable models attempt to fit a 
new image by applying constraints on the topology of the structure’s boundary. The first 
deformable model was introduced in 1987 by Kass et al. [28]. Over the past two decades, 


many extensions and modifications have been proposed in the literature. 


This thesis is motivated mainly by the desire to perform two imaging tasks: the 
segmentation and quantification of anatomical structure. We are also interested in 
establishing the relationship between deformations and various pathologies. The rest of 
this chapter presents some of the background material essential for understanding the 


methods developed in this thesis. We start with an overview of the basics of shape 
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representation in medical images. In section 2.3, we present a brief review of various 
deformable models. In sections 2.4 and 2.5, we present the basics of the Active Shape 
Model (ASM) and Active Appearance Model. Section 2.6 discusses some of the most 


relevant work in the literature. 


2.2 Shape representation 


Shape representation plays a critical role in the success of higher-level imaging tasks. In 


[10], Kendall provides a definition of shapes as follows: 


Shape: Two geometric objects have the same shape, if one can be mapped onto the other 


ne by a similarity transformation. Similarity transformation includes translation, 


otation, and scaling. 


Figure 2.1 Example of 2D shape under different similarity transformations. 


The medical imaging literature contains a wealth of methods for efficient shape 
representation that support a variety of higher level tasks such as segmentation, analysis 
and modeling. Generally, shape representation methods can be classified into two 
categories: parametric and non-parametric. Examples of parametric shape representations 
are spline-based representations [11] and Fourier descriptors [12]. Examples of non- 
parametric shape representations are landmark-based methods [13, 14], medial axis 


representation [15], and implicit level set representation [16]. 


2.2.1 Parametric representation 
This section presents the basics of parametric shape representation. In chapter 4, we 


present a novel multi-scale parametric shape representation by using a best-basis 
selection framework. A parametric shape describes the shape with a set of parameters. In 


general, a parametric shape descriptor represents the boundary of a contour by using a set 
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of ordered control points. The control points are parameterized by a parameter (s ) that 


varies from 0 to 1: 
C(s):[0,1] > R? 41} 


C(s) =[2(s), y(s)] . 
In its simplest form, a closed curve of control points is represented by using a polygonal 
approximation with the control points at the vertices of the polygon and connected by 


straight lines. 


2.2.1.1 Spline-based representation 

Spline-based representation offers a smoother representation of the contour in comparison 
to the polygonal approximation, as shown in Figure 2.2. In B-spline shape 
representation, a piece-wise polygonal curve is represented as a linear combination of 


control spline basis functions: 


Eat N-1 
Cs) = Os), (2.3) 


n=0 
where : 


Q, : Coefficients of the representation 


Y,, : Bases functions of the spline 


> 
C(s) is a parameterized curve with parameter s . 


Spline shape representation decomposes the shape into a combination of spatially 


localized polynomial basis functions. 


aastecs inal 


(a) (b) 


Figure 2.2 Parametric shape representation using (a) polygonal approximation, (b) B-Spline 
approximation 
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2.2.1.2 Fourier Descriptor 
Another famous parametric shape representation is the Fourier descriptor [12], which 


represents a contour by using a linear combination of Fourier bases. 


Let C(s)=[0,1]— R* be a 2D continuous curve. The Fourier descriptor is then 


calculated as follows: 


C(s) mn Dg lett (2.4) 


L =? 
Or, == [ C(syet"" as, (25) 
Ly 


where @, is the Fourier coefficient. 


In medical imaging, Fourier descriptors have been used to represent anatomical 
structures. In [12], Staib et al. used Fourier descriptors to construct a shape prior for 
image segmentation. In [17], the authors described a Fourier-based shape prior to 
constrain deformations in deformable Active Contours known as Snakes. In [18], 


Ghassan et al. described a statistical shape prior based upon Discrete Cosine Transform. 


2.2.2 Non-Parametric representation 


2.2.2.1 Landmark-based representation 

A landmark-based representation describes the contour of the object of interest by using a 
finite set of points on the boundary which are referred to as landmarks. Mathematically, 
an n-point shape in d-dimensional space is represented by concatenating each 


dimension into a dn -dimension vector. Thus, a 2D shape is represented as 


it 
Risin pi, ste V, dpe Al z. (2.6) 


Where x, and y, are the x- and y- coordinates of the i” point. 


Figure 2.3 shows the outline of several brain structures by using landmark-based 


representation. The landmarks are most commonly placed manually by experts who can 
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utilize their knowledge of anatomy and potential pathologies. Generally, landmarks can 


be either anatomical, mathematical, or pseudo landmarks: 


(1) Anatomical landmarks: Correspond to biologically relevant points. 
(11) Mathematical landmarks: Correspond to mathematical or geometric property 
of boundary. 


(111) | Pseudo landmarks: Used to obtain dense representation of shape by placing 


landmarks at equally spaced distances between other landmarks. 


Landmark-based shape representation is used in the Active Shape Models (ASM) 
proposed by Cootes et al. [12]. Statistical shape modeling using landmark-based 
representation requires establishing the correspondence among the landmarks in the 
whole training set. Within this context, correspondence means that each landmark is 
placed at the same anatomical location over the whole training set. Several studies have 
suggested new methods to automate the process of identifying the landmarks and 


establishing the correspondence among them [19, 20]. 


Figure 2.3 Outline of brain structure by using landmark-based representation (from [21]) 


2.2.2.2 Medial axis—based representation 

According to [15], the medial axis of a shape is defined as a set of the singularity points 
of the distance transform that lies inside the shape of interest. Medial-axis representation 
describes the skeleton of the object of interest. Recent physiological studies have 
indicated that medial profiles play an important role in the Human Visual System (HVS). 
In the field of medical imaging, medial axis representation has been successfully used for 
segmentation and analysis tasks [22, 33]. Figure 2.4 shows a typical simulated corpus 


callosum and its corresponding medial axis representation. 
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Figure 2.4 (a) Simulated corpus callosum image, (b) medial-axis representation (From [22]) 


2.2.2.3 Implicit Level-Set representation 
Level-set is a method for shape representation that implicitly accounts for a change in 
topology. A level-set representation describes the shape as the zero-level of a higher 


dimensional function. Mathematically, a level-set is described as follows: 


A 2D curve C(s)=[0,1]—> R’ is represented as being embedded as zero-level of a 


specific function in R*® whose height is sampled at regular intervals on the xy grid [16]. 


Level sets are typically used in medical image segmentation [24]. In [25], Leventon et al. 


suggested adding shape priors to level sets for better segmentation accuracy. 
p 
4 


Figure 2.5 Implicit Level set representation. A curve is given by a set of points onto the level set 
function @ 


2.3 Deformable Models 


Formally, a deformable model can be defined as follows [26]: 


eformable model: A model which under an implicit or explicit optimization criterion 


deforms a shape to match a known object in a given image. 
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In general, deformable models use prior knowledge about the object to facilitate the 
extraction of the object boundary. In [27], Jain et al. classified deformable models as 
being either free-form or parametric. Free-form models employ local shape constraints 
(e.g., smoothness) as the only prior knowledge. On the other hand, parametric models use 
global shape constraints. Figure 2.6 summarizes Jain’s classification of the deformable 


models. 


Figure 2.6 Classification of deformable models 


2.3.1 Active Contours (Snakes) 
In [28], Kass et al. introduced the most popular deformable model known as Active 


Contours (or Snakes). Snakes are free-form deformable curves that move freely under the 
influence of a combination of forces: (i) an internal force that imposes local constraints 
such as smoothness and (1i) an external force that encourages movements towards desired 


image features such as edges. 


Mathematically, Snakes are formulated as an energy-minimization problem [29]. A Snake 
2 
is a parameterized contour embedded in the image plane Ci y) eR’ Let 
i 
v(s) = [x(s), y(s)] represents the outline of the contour, where x(s), and y(s) are 


the coordinate functions, and se€[0,1] is the parametric domain. The shape of the 


contour is defined by the energy function E(v): 


E(v)=S(v)+ PQ), Can 
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where S(v) represents the energy of the contour, and P(v) represents the energy that 


couples the Snake to the image. The deformation of the contour is characterized by the 
following equation: 
2 


ds, (2.8) 


2 
+ w, (Ss) 


Os 


ov O’v 
S(v)= a 
(v) J (s) >, _ 


where S(v) characterizes the deformation of the contour, w,(s) controls the tension of 


the contour, and w,(s) determines its rigidity. The potential energy is given by the 


following equation: 
1 
P(v)= | W(v(s))ds. (2.9) 
0 


P(v) is a potential energy function that couples the contour to the image plane. P(v) is 
usually designed to have local minima coincide with the intensity extrema in the image 
plane. A Snake with a potential function ‘Y(x, y) = —c|VIG, xe y)]| will be attracted 


to the intensity edges in the image. 


The final shape of the contour corresponds to the minimum of equation 2.7. Using 
Calculus of variations, the contour, v(s), that minimizes equation 2.7 must satisfy the 


Euler-Lagrange equation: 


O ave «hes dv 
— 5-0, 5 +5 G (0m, 5) +t WP(W(s,8)) =0. (2.10) 


Equation 2.10 represents the balance between the internal and external forces when the 
contour rests at equilibrium. Figure 2.7 shows an example of applying Snakes to a 
segmentation problem. Generally, free-form models have no global shape constraints. 
Thus, they are best suited to aid in manual segmentation tasks. It is now accepted that 
free-form models suffer from a number of limitations. First, they need to be initialized 
close to the structure of interest. Second, they use only the smoothness constraints of the 
contour, and this limitation can result in non-plausible shapes. Over the past two decades, 
many improvements have been suggested to the original Snake model. In [30], Cohen 
suggested using an internal inflation force in the Snake model to increase its sensitivity to 
spurious edges. In [31], Xu et al. proposed a Gradient vector field-based snake. A multi- 


scale formulation of the snake model was introduced in [32]. In this formulation, the 
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Snake is allowed to come to equilibrium on a very low-scale energy function and then 
slowly increases the scale. In an attempt to deal with changes in topology, Malladi et al. 
proposed a deformable model using level sets [11]. In [18], Hamarneh et al. proposed to 


embed both local and global shape constraints within the free-form model. 


Figure 2.7 Example of Snake-based segmentation ( a) Intensity CT image of heart left ventricle (b) edge 
detected image (c) initial contour (d-f)deformable contour moving toward left ventricle boundary 
(from [29]) 


2.3.2 Physical models 
Physical deformable models are a class of deformable models in which a single example 


(prototype) is used to build the parametric model. The prototype is allowed to vary 
according to some physical model. Park et al. [33] and Pentland et al. [34] presented a 
model using the Finite Element method. Variations are defined as the vibration modes of 
the model. Figure 2.8 shows examples of the Finite Element modes of vibrations for a 
square and a lumbar spine. A common criticism of physical models in medical imaging is 
that nothing guarantees that the generated modes of variations will correspond to 


plausible deformations. 
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Figure 2.8 Modes of vibrations of Finite Element model for a square shape 

2.3.3 Statistical shape models 

Statistical shape models aim at capturing acceptable patterns of variations in a class of 
objects by using a training set of examples. In [12], Staib and Duncan suggested using 
elliptic Fourier descriptors to represent the contour of the object. The Fourier coefficients 
are trained by using a Gaussian distribution as the prior probability. The optimal estimate 
of the boundary is then derived by using the Bayesian decision rule. This model 
represents one of the earliest statistical shape models in medical imaging. However, using 
the Fourier basis tends to limit the model’s ability to capture localized variations. Cootes 
et al. [35] presented the seminal work involving the Active Shape Models (ASMs). The 
Active Shape Model is a statistically-based parametric deformable model that is used to 
outline objects of interest in the image. The model is built up from a representative 
average shape and a weighted sum of modes of variations learnt during the training 
phase. Segmentation is achieved by allowing the model to deform only in a way implied 


by the training set. 


In the subsequent sections, we present the details of the Active Shape Models and the 


Active Appearance Models. 


2.4 Active Shape Models 
The Active Shape Model (ASM) was introduced by Cootes et al. in [35]. A detailed 


description of the model and its applications was later presented in [36]. The Active 
Shape Model is a deformable model that globally constrains deformations to valid 


instances of the object of interest. This global shape constraint tends to improve the 
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specefity of the model and to provide better segmentation. The ASM algorithm has three 
main parts: (i) The point distribution model (PDM), (ii) The local appearance model, and 


(111) the model search procedure. 


2.4.1 Point Distribution Model 
The contour of the object is represented by using a fixed number of labelled 


landmarks P; : 


PP= (CAO OTH TAN . (2.11) 


Prior to modeling the shape variations in the training set, they are aligned with respect to 
translations, rotations, and scaling by using the Procrustes analysis method. The 
following summarizes the alignment process [38]: 

i. Choose an example shape as the mean shape. 

ii. Align all shapes to the mean. 

ili. Re-calculate the estimate of the mean by using the aligned shapes. 

iv. In case the new mean estimate differs significantly from the other one, repeat 


from step (ii). 


Non-rigid variations are then modelled by using the Point Distribution Model (PDM). 
The PDM uses Principle Component Analysis (PCA) to capture the second-order 
statistics in the training set. The Point Distribution Model represents a shape as a mean 
shape plus a weighted sum of basis functions defined as the eigenvectors of the 


covariance matrix. Mathematically, the PDM is defined as follows. 


The mean shape is 


i (2.12) 


The Covariance Matrix is 


ers ox: ie Faye Rat gary (2.13) 


The eigenvectors Y, and eigenvalues A, of S are then computed: 


SQ, =1.%. (2.14) 


22 


patent! to yeti ewe 


, h es i ae ee | 
cits Wake ac a es 


a 


‘ ~, 


YM Soucy (ive Dergtha ote ont ISA eae ey ot nt saith’ + slide 
#1 boron vlayvienn eeu Pts eine ft el? spina oe savin 
| [Re aato incre 2 a me | 
pgor Le Sata «cath oieecn tl cn a | 
A tage Pa) or tis gas 
yecparle banal oat atileat io desta. an. vo roatises 9 a 
atin: 2d. csdio aa iat linens Asti ened se wan aaah ; 


MTA Soha natn habe ct, diel sida Sat aah fet Act tai! 
yoked al att sshqud i, (Ay, ager Hema) al an et 

hear ik oie ik Aopeaee GAR tac ey ‘ea gatita 
wile oe norobies p Sily a bn Peoitgoud on 90 | “ote edie: 


AW an) te baittlab Ai ais eh aids 7 


we mi Ae gus iy 
| tety ie 


tata aie re Cour 


baiuninas oa 1 ais * ton ! 


(HS) . Z ah ne 


VAS 
Tt . 


The matrix of the eigenvectors @ = LQ, Q)..... Py | represents the modes of variations 


of the shapes. Usually, variations in the training set can be explained by using a small 


number of modes (1.e., eigenvectors). Thus, we can obtain a good estimate of the shape 


by retaining the eigenvectors @, that correspond to the f largest eigenvalues: 


x= x+@b (2.15) 


~=[Q, Q,.....Q], (2.16) 


where b =[@, @,....0,]’ is the vector of model parameters. The variance of @, over the 


training set can be shown to be equal to the eigenvalue As . Thus, we can obtain plausible 


shapes by restricting the model parameters to vary within a range of tm,| A, . Figure 


2.9 shows two modes of variations for a Point Distribution Model of hand shapes. 
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Figure 2.9 First two modes of variations of PDM model for hand shapes 


ae 


2.4.2 Local appearance model 
During the training phase, a local appearance profile is constructed for every landmark 


point of the contour of interest. Given a training set of N p images, the local appearance 
profile is constructed as follows: 


For every image in the training set (I j > we extract a gray level profile $j for every 


landmark point Je of the contour of interest. The grey level profile is a vector of 
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intensity values sampled from the linear profile passing by the landmark and 


perpendicular to the shape: 
a T 
Bij — [8ij0, 8:1, are Sin] , (AT) 


where gj, is a profile of the i” landmark point extracted from the image if j using N 


samples. 
In order to obtain invariance to uniform scaling, we obtain the normalized derivative of 


the grey level profile: 


dg. 
Vii — W2 = (2.18) 
dg, 
k=0 
3 T 
085 = [841 — 8i028ip Sa Son — Sano) (2.19) 


where y,, 1s the normalized derivative of the grey level profile, and dg; is the derivative 


of the grey level profile. 


Given a set of NV p normalized derivatives of the grey level profiles, the local appearance 


profile for the i” landmark point is then defined as the mean profile y. and the 


covariance matrix S, : 


— 1 (2.20) 
yi =—)y, 
Ne j= : 
LAs? aia ied 
S; arco a tah MY - y; ee (2.21) 
p J>\ 


2.4.3 Model search procedure 
Segmentation is then achieved by fitting the learnt model to a new image by estimating 


the model parameters (b ) and the pose parameters ( f,, tS, @ ) that best fit the image: 
x=T , .,(x+@b). (2.22) 
T, , ,. is a function that translates the shape by (¢,,¢,), rotates it by @, and scales it by 


Ss. We start with an approximate pose estimate (tf s,,9,), and the model mean 


Xq fy 2 


24 


Ame ty 


bw aiaetins! ott ye gilieng ailetg ges aM 


. v¢ me ae i Ry Del Aa 
{ | © y Ey ii “re yg aust , 


4 
Vi 41 Oa Sit wdereye- eal (ert Lato ie 


i avpeviveb bextincnon snipe: aN giileae hn y 


1 C ", wl? ~~ ae i. rae noe 
Surtevrion ot) 7 oop Gor 4 thidag lav a oes ans ie vad ttn 
+ ip ont 

SST! Hi 20i on Eyal yas ity te nvomeatesioems Mk 


' - iy 
ek ay — F 


oid, bine iy storks nguba ott. an batilob ryallyy at i a wma 


if . a: 


1 


(05.5) ‘a 


cry | (i “are to 
GIT pen rey id Solan wary rw isto erat am aeiut vet hovaigion scoala 


SAME. ms iad wh (Bi, Ma aumini ni, ous baw ( din 


(ce e} 7 if Ady x Pee aa ay ss ee 


% oe, i yey 


Wd 31 ealnay bins, ord af abet, ( ds added — vos at 
| eae 
FLU OOEY, Taber ett) bng dite 4 ») in iy Pee 


4 uk 


KE 


shape ( X ). The pose and model parameters are then estimated iteratively so that the 


model best fits the new image. The search iteration consists of the following two tasks: 


(i) For each landmark point, examine a search profile along a line perpendicular 
to the contour and passing by the landmark point. Find the location that best fits the learnt 


local appearance profile of this landmark. This is the location that minimizes the 


Mahalanobis distance ( h(y,)). The output is a new shape estimate based upon the local 


appearance profiles: 


h(y,)=(y,-Yx)'Sy “(¥;— Yu) > (2.23) 


where jy, is the local appearance profile for the current position in the search profile, and 


¥i3>,. is the local appearance model (mean and covariance matrix) for the k” 


landmark. 


(ii) Update the pose and model parameters (tf. st, 58 ,9 ,b) to best match the 


new shape estimate from (i): 


b= 9 (x, — x) (2.24) 


is Rory (x+ ob). (2:25) 


More details for updating the parameters of the ASM model are given in Appendix B. 


2.5 Active Appearance Model 


The Active Appearance Model (AAM) is a generative model that can synthesize images 
similar to those in the training set [37]. Typically, Active Appearance Models are used for 
image interpretation and synthesis applications. Using a training set of images, we model 
both the shape and texture (pixel intensity) of the object of interest. Image interpretation 
tasks can then be achieved by adjusting the model parameters to fit the new unseen 


image. 


2.5.1 Shape model 


Shape variations across the training set are estimated by using the Point Distribution 
Model (PDM) described in section 2.4. The outcome is an estimate of a mean shape and a 


set of PCA-based basis functions that define the modes of shape variations: 


25 


gi Weed oe vlovinnrat bani euails one ene ) 
aadent ove gubwathed sith te: eau Honest 


oti gaat cet. yer) AOUNOO! ee ee) ee . 
ht 

isco! Sih cbt brane ‘ohaniitas Heynelid 'womer Ge Yueginn vt FA 

in 1 pasa ; 


| sa 
"od! cel (hi sonheves fh st tebom ~onmeoqga Nia ro . 
We 


oil ctaterte head) at ARE Wy da : a) iastommee: te bo an, alt oe, 


= 
Hi 


PI) OSMAN! M183 Bh as oben’ beaivnsa eu . Bra Treats tat 
nt beot ans lobo! Outi oq visa) ef ael vey rd seh et piecasats rat 
habnicn sur eat io toe geriaalaet 6 wfies) .anoite siTagn ioseiigs al 
nplismd int seth tas civajeni Wid “ict eh fd (Aignini \niq) ras be 6 C dock 

aunts Woe eh Ae ni bob: a cues wd) Serbia , 


* Ney 


i Leelee iC tere Shi unter! “a ketisenlazey 7” ws aie “a, 
one sisal fase a Vo Diane cm al choo AE i noice wi stro 
v | ahs ) oe pe Ya ae : 
Arete sl to iad ont ak ef ecto si oe 


X=x+@.b (2.26) 


s ’ 
where x : Mean shape, 
@, : Matrix of shape modes of variations, 
b, : Vector of shape parameters. 


2.5.2 Texture model 
The Active Appearance Model (AAM) also models the texture appearance of the object. 


First, we sample the pixel intensities within the convex hull of the shape of the object. All 


samples are then warped to the estimated mean shape (x) by using the Delaunay 
triangulation method [39]. Next, Principle Component Analysis (PCA) is used to model 


the texture appearance, as well as the combined variations between the shape and texture: 
g=g+9,b,, (2.27) 
where g : mean normalized grey-level vector, 


9, : Modes of intensity variations, 


b, : grey-level model parameters. 


In order to account for any potential correlation between the shape and the grey-level 


model parameters, Principle Component Analysis (PCA) is applied to a concatenated 


vector of b, and b,: 


Wb Q 
te |e oe (2.28) 
Sea eo 


where Q, : Basis matrix of texture model, 


Q, : Basis matrix of shape model, 


Q : Basis matrix of combined appearance model, 


c  : Parameter vector of combined appearance model. 
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The complete model can then be formulated as follows: 


x= x+9,W,Q,c, g=2+9,Q,c. (2.29) 


A new image can be synthesized as follows: 

- Generate a shape-free grey level image by using the appearance parameter 
vector. 

- Warp the new shape-free image to match the landmarks of the target shape. 
Figure 2.10 shows the first two modes of variations for an active appearance model of the 


lumbar spine. 


Figure 2.10 First two modes of appearance model of the lumbar spine 


2.5.3 Image Search 
The Active Appearance model is then matched to unseen images by using least square 


criteria. The algorithm minimizes the difference between the sample from the searched 
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image (g,) and the synthesized sample image (g.,) by varying the model parameters 


which control the shape, texture, and pose. For more details, refer to [38]. 


2.6 Related Work 


This thesis describes new methods for the statistical modeling of the shape and 
appearance of anatomical structures. We apply these models for detecting and quantifying 
of vertebral structures in x-ray images. Figure 2.11 shows a block diagram of the general 
framework of the work presented in this thesis. This section presents a brief review of the 


most relevant work from the literature. 


Figure 2.11 General framework for the work presented in the thesis 
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2.6.1 Statistical Shape Modeling 
Statistical shape models try to faithfully represent the full range of biological variations in 


anatomical structures. These models are now widely used as prior knowledge in a wide 
range of applications such as segmentation, registration, and surgical planning. 
However, the accurate modeling of anatomical structures is a challenging task due to 


shape complexity and the wide range of possible local and global variations. 


Principal Component Analysis (PCA) has been widely used in the literature for modeling 
shape variations. Nevertheless, many researchers now understand that PCA basis 
functions are global in nature, and fail to generalize well to complex unseen 
deformations. To overcome the limitations of PCA-based shape models, several 


approaches have been suggested in the literature [40]. 


Pentland and Sclaroff, [33], suggested a shape model based upon the mechanical modes 
of deformations. This model eliminates the need for a large training set. Cootes and 
Taylor [41] proposed a hybrid shape model that combines both statistical and mechanical 
modes to produce better generalization. The main problem with physical models is that 
the resulting deformation might not reflect realistic shape variations. Wang et al. [42] 
suggested using a mixed covariance matrix, which is defined as a weighted sum of the 
estimated covariance matrix plus another matrix that imposes smoothness. Cootes and 
Taylor [43] suggested modeling the shapes in the training set as a mixture of Gaussians. 
The authors showed that their model could better capture non-linear variations than 
traditional Active Shape Model (ASM). In [44], Bowden et al. proposed a new method to 
overcome the linearity in the PCA model. Data are first clustered into a number of 


clusters. A separate PCA model is then constructed for each cluster. 


The work of Davatzikos et al. [45, 46], Mahmoud [40], and Nain et al. [47, 48], is more 
related to the methods presented in this thesis. In [45], Davatzikos et al. proposed a 
hierarchal shape prior using the wavelet transform and PCA. The authors suggested by 
using the Discrete Wavelet Transform (DWT) to construct a hierarchal representation. 
Principal Component Analysis (PCA) is then used to capture the statistics of the 
coefficients. In [40], Mahmoud suggested by using a wavelet packet basis set. The 


coefficients are then modeled using Principal Component Analysis (PCA). In [47, 48], 
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Nain et al. proposed an approach to model caudate nucleus shapes by using spherical 


wavelets and Principal Component Analysis (PCA). 


In an attempt to overcome the inherent limitations in PCA-based models, a recent trend 
has involved the search for a better deformation basis set more suited for modeling 
anatomical structures. Instead of finding the basis that maximizes the variance in the 
training set, some researchers proposed using a basis set that maximizes the statistical 
independence in the training set. The Independent Component Analysis (ICA) method 
was first proposed for shape modeling applications in [49]. An ICA-based shape model 
represents a shape as a mean shape plus the weighted sum of the independent modes of 
variations. Rogez et al. [50] used an ICA-based shape model for the segmentation of a 
human figure. Uzumcu et al. [51] suggested using Independent Component Analysis 
(ICA) for modeling shapes in medical applications. Koikkalainen et al. [52] proposed a 
combined ICA- and PCA- based model for the segmentation of magnetic resonance 
images. To achieve more interpretable localized models, Sjostrand et al. proposed the use 
of Sparse Principal Component Analysis (SPCA), which imposes a sparsity constraint 
onto the PCA basis [53, 54]. Chennubhotla et al. proposed a method for extracting multi- 
scale structure from data by using sparsePCA [55]. In [111], Olafsdottir et al. compared 
the performance of PCA, ICA, and SPCA in the statistical deformation modeling of a 
Crouzon mouse. In [122], Suinesiaputra et al. presented a method for detecting abnormal 


cardiac contraction in short-axis MR images by using ICA. 


2.6.2 Statistical Appearance Modeling 
Generative models are often used in medical imaging to constrain solutions for various 


complex ill-posed problems. Statistical models of appearance are popular examples of 
generative models. In chapter seven, we present a novel method for enhanced modeling 
of texture appearance, with an application to medical image synthesis. This section 


summarizes some of the related work in the literature. 


In [37], Cootes et al. presented the Active Appearance Model (AAM) as a new method to 
capture combined variations in shape and texture. A review of the basics of the Active 
Appearance Model was presented in section 2.5. Since the introduction of Active 
Appearance Models, the modeling of texture appearance in medical images has been 


performed mainly by using a PCA basis. However, using the PCA basis to model texture 
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appearance has a number of limitations. The PCA basis captures only the global modes of 
texture variation, while recent studies have emphasized the role of locality in medical 
imaging. Most pathologies are often related to local deformations in both geometry and 
texture [55]. In order to overcome the limitations of Active Appearance Models, a 
number of researchers have tried to achieve locality in appearance modeling. Roberts and 
Cootes [56] proposed a method to capture localized texture variations by using linked 
active appearance models. These models are used for the segmentation of lumbar 
vertebrae in digital radiographs. Delac et al. [57] proposed a statistical appearance model 
using Independent Component Analysis (ICA). This model is used for face recognition. 
In [59], Zhang et al. presented a method for modeling human faces with parameterized 
local shape morphing. In [60], Stegmann et al. presented a method for the sparse 
modeling of landmark and texture variability by using the Orthomax criteria. The model 


generates localized shape and texture variations. 


Another challenge that faces traditional intensity-based appearance models is the high 
resolution of medical images. The higher the image resolution, the more unfeasible it 
becomes to model texture appearance by using raw intensity pixels. Researchers have 
suggested various methods for capturing texture variations in other compressed spaces. 
The challenge is to achieve high reduction rates without degrading the performance of the 
model. In [61], Wolstenholme and Taylor proposed using Haar-based wavelet 
compression in Active Appearance Models. A compression ratio of 20:1 was 
accomplished while maintaining acceptable accuracy in the model. In [62], Stegman and 
Cootes extended the work in [61] by presenting a wavelet enhanced appearance model 
using a bi-orthogonal CDF 9-7 wavelet basis. In [63], Larsen et al. presented a texture 
enhanced appearance model using Wedgelets. The Wedgelet representation was shown to 


be superior to wavelet representation at higher dimensionality reduction rates. 


Statistical models of shape and appearance have also been used in the literature to 
simulate deformations in medical images and to generate ground truth data. These tasks 
are becoming important as simulated ground truth data can be useful for validating 
segmentation and registration algorithms. These models can also be used to artificially 
enlarge the size of training sets. Various methods have been proposed in the literature to 
achieve image synthesis. Traditionally, the simulation of the deformations in medical 


imaging is performed by using PCA-based methods. However, these methods are often 
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unable to capture realistic and complex deformations of the anatomical structure. Abboud 
et al. [64] proposed a method for facial expression synthesis by using PCA-based 
appearance models. In [65], Seals et al. proposed a method for enlarging the training set 
by using image morphing. In [66], Lee et al. presented a method for morphing multiple 
images. In [67], Hamarneh et al. presented a method for simulating the ground truth 
validation data via physically- and statistically-based warps. Rose and Taylor [68] 
proposed a statistical model for mammographic appearance by using steerable 
decomposition and PCA modeling. In [69], ElSafi et al. proposed a method for the 
Statistical simulation of deformations by using Independent Component Analysis. In [70], 
Xue et al. presented a PCA-based method for simulating deformations in MR brain 


images. 


2.6.3 Model-based segmentation methods 
The segmentation of anatomical structures in medical images is a pre-request stage for 


various computer-assisted medical applications such as shape analysis, computer-aided 
diagnosis and surgical planning. Manual segmentation methods are both time-consuming 
and non-reproducible. Computer-assisted segmentation methods aim at reducing the time 
and cost of the error-prone manual segmentation process. In chapter six, we present a 
novel model-based vertebral segmentation method using local appearance profiles and a 
multi-scale shape prior. This section summarizes some of the related segmentation 


methods in the literature. 


The work of Staib and Duncan [13] is one of the earliest publications to suggest using 
statistical prior models to guide the segmentation process. The authors proposed using 
elliptic Fourier descriptors to represent the object contour. The Fourier coefficients are 
trained by using a Gaussian distribution as the prior probability. Another milestone is the 
Active Shape Model (ASM) proposed by Cootes in [35]. Some of the basics of Active 
Shape Models (ASM) were presented in section 2.4. In the field of medical imaging, 
researchers now widely accept that the Active Shape model (ASM) suffers from a 
number of inherent limitations. First, being a PCA-based model, it fails to generalize well 
to unseen deformities of the anatomical structure. Second, the intensity-based appearance 
model in ASM inherently assumes that the object of interest has clearly identified 


boundaries, but this assumption is not necessarily true for medical images. 
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Several approaches have been proposed in the literature to address the challenges of 
prior-based segmentation methods. To improve the matching between the prior model 
and the unseen image, researchers have suggested using various image representations 
and features. Scott et al. [71] suggested using local features such as oriented gradients, 
comers and edge strengths. Van Gienneken [72] proposed a new local appearance profile 
using responses from a filterbank of Gaussian derivative functions. In [73], Li et al. 
suggested using steerable filters to extract the local edge features and used these features 
to construct the local appearance profile. In [74], De Bruijne and Nielsen proposed an 
image segmentation method using shape particle filtering. Seghers et al. [75] proposed an 
enhanced gray level local appearance model extracted from feature images. Zhu et al. 
[76] proposed a method to improve the search procedure of the Active Shape Model 
(ASM) by using Mixture of Gaussians for the grey-level profiles. Metaxas et al. [77] 
suggested a hybrid deformable model for segmentation and registration. In [78], Langs 
proposed the Active Feature Model (AFM), in which local texture descriptors are used. A 
number of researchers have also tried to develop more efficient shape prior models. In 
[79], Uzumcu et al. suggested using Independent Component Analysis (ICA) for 
modeling the shape variations in medical imaging. In [45, 80], Davatzikos et al. proposed 
a hierarchical active shape model using the wavelet transform and Principal Component 
Analysis. In [81], Koikkalainen, and Lotjonen presented a method for segmenting the 


atria in MR images by using combined PCA and ICA shape priors. 


2.6.4 Segmentation and analysis of vertebral structures 
In this thesis, we are particularly interested in the segmentation and quantification of 


human vertebrae in x-ray images. Accurate vertebrae segmentation is important for the 
proper assessment of various vertebral abnormalities. This section summarizes some of 


the work in the literature related to vertebral segmentation and shape analysis. 


Over the past decade, several model-based segmentation methods have been proposed in 
the literature to extract both normal and pathological vertebral structures. In [56], Roberts 
et al. proposed a method for segmenting fractured vertebrae by using linked active 
appearance models. Howe et al. [82] presented a hierarchal method for segmenting 
cervical and lumbar vertebrae by using the Generalized Hough transform and the Active 
Appearance Model. The Generalized Hough transform is first used to estimate the initial 
pose. The Active Appearance Model (AAM) is then used to reach the final segmentation. 


The authors reported a segmentation accuracy of 4.35 mm for the lumbar spine. De 
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Brujne et al. [74] reported a mean point-to-contour error of 1.4 mm for lumbar spine 
segmentation using shape particle filtering. This method is most useful in offline 


applications since it has a time-consuming pixel classification stage. 


Statistical shape analysis refers to the ability to understand how anatomical structures 
tend to vary for various reasons such as the development of a certain pathology, aging, 
and surgery. Computer-aided diagnosis for vertebral abnormalities has been the focus of 
various research groups worldwide [83]. The common pathologies of interest include 
anterior osteophyte, disc space narrowing, inter-vertebral disc degeneration, and 


fractures. 


The conventional methods for the diagnosis of vertebral deformities use a six-point 
representation for the vertebral body. Traditionally, most clinical practices have used this 
representation to assess various vertebral deformities. The six-point representation is 
usually used to derive measurements such as Anterior Height (AH), Middle Height (MH), 
and Posterior Height (PH). Based upon these heights, many methods have been proposed 
to quantify various vertebral deformities such as fractures [84, 85]. The major limitations 
of these methods are as follows: (i) they are subjective to the user, and usually fail to 
capture subtle shape changes, and (ii) the trade-off between specefity and sensitivity 


remains problematic. 


Recent studies have attempted to find better shape representations for the analysis of 
vertebral structures. The National Library of Medicine (NLM) has been long interested in 
the vertebral classification of spine x-ray images for content-based image retrieval 
applications. Researchers at the engineering division of the NLM have implemented and 
evaluated several shape representations for the retrieval of spine x-rays by using anterior 
osteophyte pathology. Examples include Fourier descriptors, polygonal approximations, 
invariant moments, convex-hull based features, and partial shape matching using dynamic 
programming [86-89]. Some other researchers suggested using Principal Component 
Analysis (PCA) for the analysis and classification of vertebral deformities. Symth et al. 
[90] suggested using the Point Distribution Model (PDM) to represent vertebral shapes 
and employed a Mahalanobis distance classifier to identify vertebral fractures. Roberts et 
al. [91] suggested using Active Appearance Models for the segmentation and assessment 


of vertebral abnormalities. Promising results were reported. De Bruijne et al. [93, 94] 
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proposed a method for the classification of vertebral fractures in x-ray images by using 
pairwise conditional shape models trained on a set of healthy spines. Roberts et al. [92] 
proposed a method for vertebral classification using parameters from Active Shape 
Models and Active Appearance Models. A final comment regarding vertebral 
quantification is that none of the methods mentioned in this section is capable of 
analyzing multiple vertebrae simultaneously; in other words, no spatial information about 


the pathology is available. 
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3. Multi-scale Medical Image Analysis Tools: 
Wavelets and Beyond 


3.1 Introduction 
Advances in the field of computational harmonic analysis have been inspired partly by 


our understanding of the Human Visual System (HVS). Figure 3.1 shows a model of the 
low-level vision system as described in [95]. Vision researchers now believe that 
receptive field cells exhibit a localized response in both time and frequency. Various 
harmonic analysis tools attempt to mimic the behaviour of these cells. The work of Burt 
and Adelson [157] is among the earliest efforts in this field. The authors introduced the 
Laplacian Transform as a tool for multi-scale image analysis. In an independent effort, 
some mathematicians developed the wavelet transform, in which the basis functions are 
translations and dilations of a mother wavelet function. It is now accepted that a 
considerable amount of similarity exists between the response of the mammalian visual 
system and wavelet basis [95-97]. In [95], Olshausen and Field describe the receptive 
fields in the mammalian visual cortex as follows: “The receptive fields in the mammalian 
visual cortex can be characterized as being spatially localized, oriented, and band-pass 


comparable to the basis functions of wavelet transform.” 


ee Response Che 
Receptive field ‘ oe eins Pointwise 
Image Li non-linearity Response 
inear 
response aay 
Kx,y,t) ———> a oF ie eee ent) 
K(x) i 
neighboring 
neurons 


Figure 3.1 Model of the human low-level vision system [95] 


Figure 3.2 Similarity between receptive fields in mammalian primary visual cortex and Gabor wavelets 
[96]. 
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Traditionally, wavelets have been extensively used for image analysis and interpretation 
tasks [98]. To overcome some of the limitations of traditional wavelets, a number of 
multi-scale analysis tools were introduced, including Non subsampled wavelets, Gabor 
filters, Steerable pyramids, Contourlet transform, and Non Subsampled Contourlet 
transform (NSCT). This chapter provides an overview of some of the recent advances in 
multi-scale image analysis tools. The chapter also discusses the suitability of various 
multi-scale tools for the analysis and quantification of medical images. Please refer to 


[97-99] for a more detailed theoretical background of the transforms. 


3.2 Wavelet transform 
3.2.1 Definition 


Since their introduction, wavelets have become a popular tool for signal and image 
analysis in a wide range of applications including data compression, de-noising, and 
feature extraction. A wavelet transform decomposes a signal over a set of basis functions 
obtained by the scaling and translation of a mother wavelet function. A mother wavelet is 


a function that satisfies the admissibility condition: 


2 
C, = | —Wo <0 (3.1) 


where Ye L’(R), and ‘W(@) is the Fourier transform of W(t). If W(@) has 


sufficient decay, the admissibility condition reduces to the constraint in which the mother 


wavelet function has zero average: 
[ vat =y@)=0. (3.2) 


A family of wavelet basis functions is obtained by scaling and translating Y(t) by a 


andb , respectively: 


1 t—b 
WY, (= TA ° (3) 


Two functions are needed to extract the low and high frequency components: the scaling 


and wavelet functions. Given a wavelet function Y(t) and its companion scaling 


function @(t) , the following condition must be satisfied: 
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$(t) = V2¥ h(b)G(2t -b) (3.4) 
b 


y(t) =V2¥ g(b)g(2t-b), re 
b 
where h(b) & g(b) are related as follows: 
e(b) =(-1)’hU—-b). (3.6) 
The wavelet transform of a signal is defined as follows: 
hi (3.7) 


Xyr(a,b) = | xy, (dr, 


—co 


where xe L’(R). 
Multi-resolution Analysis (MRA) 


By using two orthonormal sets of wavelet and scaling basis functions, the wavelet 


transform can be used to perform multi-resolution analysis. In the space of the finite real 
functions L’(R), we can define a sequence of resolutions such that signals at a scale 


smaller than 2~/ are suppressed at the ii resolution level. Let V, define the subspace of 


—jth 


functions containing the signal information down to the 2.’" scale. The multi-resolution 


representation is then obtained by decomposing the signal into a sequence of subspaces 


Vee eVe (ave CV, CuA(R) pwhere 


V 


J+ 


VWs. (3.8) 
j j 


W,, is the detail space at the j'" resolution level and is orthogonal to the approximation 


space (V,). The approximation space at a certain resolution can then be written as a sum 


of all subspaces, as shown in equation 3.9 [60]: 


Vin =W, OV, =W, OW, OV, =...=W, Ow, OW 2... BW 5 OV... 
(3.9) 
The wavelet transform decomposes the signal into approximation and detail signals. A 
multi-resolution analysis (MRA) can then be achieved by iterating the decomposition 
process onto the approximation signal. The multi-resolution decomposition of a function 


is given by: 
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where dy, ; =< ea. j > and GS ecg ‘if We ui. > are the approximation and 


wavelet detail coefficients, respectively. < ., . > is the inner product that orthogonally 
projects the function f onto the scaling or wavelet basis. M represents the highest 


decomposition level of the wavelet transform. 


The wavelet decomposition of a two-dimensional signal is performed by applying the 1D 
wavelet transform onto the rows and the columns. Figure 3.3 illustrates a single 
decomposition step of a two-dimensional image. The process is repeated iteratively to 
obtain the multi-scale representation. Figure 3.4 illustrates the image decomposition 


using XY separable 2D wavelet filters. 


A 


Figure 3.4 Image decomposition by using XY separable 2D wavelet filters 


3.2.2 Wavelets in medical image analysis 
Traditionally, wavelets have been extensively used in various image analysis and 
computer vision tasks. In the field of medical imaging, wavelets have proven useful in 


applications such as segmentation, registration, and image enhancement [114]. Their 
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usefulness is attributed to a number of appealing properties such as multi-scale 
representation and joint spatial-frequency localization. Over the past years, different 
wavelet families have been introduced in the literature. Each wavelet family possesses its 
own significance and drawbacks. Table 3.1 summarizes the properties of four well known 


wavelet families. 


Table 3.1 Properties of four well known wavelet families 


Compact 


support 


Despite their success, wavelets suffer from a number of limitations affecting their 


performance in medical imaging tasks. Some of these limitations are listed below: 

- Lack of orientation selectivity: Wavelets suffer from poor orientation selectivity. 
Wavelet basis functions have only three permissible orientations. Thus, the 
wavelet transform is not capable of capturing rich orientation information about 
the structure of interest. Figure 3.5 illustrates the incomplete orientation 


selectivity of the wavelet transform. 


Figure 3.5 Wavelet decomposition of an x-ray image of the lumbar spine 
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- Lack of shift invariance: Standard wavelet transform suffers from a lack of shift 
invariance due to the down-sampling step. Wavelet coefficients tend to change 
dramatically due to the translation of the input signal. This problem limits the 


capabilities of the transform as an image analysis and feature extraction tool. 


- Sub-band mixing and information shift: Despite the computational advantages of 
orthogonal wavelets, they suffer from phase non-linearity and weight-shift 
property. The non-symmetry of orthogonal wavelets causes an information shift 
among different bands because of the projection onto biased basis functions. This 
information shift property limits the efficiency of the transform in localized 
image analysis tasks. Figure 3.6 illustrates the information shift property for 


Daubechies wavelets. 


Figure 3.6 Information shift property for four-level orthogonal wavelet decomposition. 


- Response to higher order singularities: In [99], Do and Vetterli showed that 
wavelets are optimal only in capturing point singularities. Wavelets are adapted 
only to abrupt changes in an image due to the XY seperability of the basis 
functions. Thus, wavelets are not optimal in capturing line and curve 
singularities. A two-dimensional wavelet function “sees” a smooth contour as a 
number of point singularities. This feature decreases the efficiency of the 
transform in representing smooth contours and increases its sensitivity to noise. 


Medical images often exhibit higher-order singularities. Thus, the performance of 
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wavelets in medical image analysis is restricted by their ability to capture only 
horizontal, vertical and diagonal discontinuities. This problem is illustrated in 
Figure 3.7(a), where a large number of coefficients is needed to represent a 
smooth contour using wavelet bases. This problem obviously limits the 
performance of the wavelet transform as a feature extraction tool. Figure 3.7 (b, 
c) shows an example where a wavelet-based edge map fails to capture the noisy 


contours of the vertebral bodies in an x-ray image. 


Figure 3.7 Illustration of the limitations of wavelets in representing contours. (a) Wavelet 
representation of a contour, (b) an x-ray image of lumbar spine with noisy contours, (c) wavelet-based 
edge map 
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3.3 Gabor Transform 
3.3.1 Definition 


In the seminal paper “Theory of Communications,” Denis Gabor introduced the concept 
of localization in time and frequency [100]. He argued that a useful decomposition basis 
function should be jointly localized in both time and frequency. The concept of joint 
localization aims at finding the minimal amount of information bounded by the 


uncertainty principle: 


1 (3.11) 
At Af >— 
f An’ 


where At : The time duration of the signal. 


Af : The bandwidth of the signal in the frequency domain. 


The function which achieves the minimal boundary of the uncertainty principle is the 
result of modulating a harmonic oscillation of any frequency with a pulse in the form of a 


Gaussian distribution function [100]. A 1D Gabor function is given by 


9 


tis the centroid of the Gaussian, 


f, is the frequency of harmonic oscillations, 


Q is the phase shift of oscillations. 
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Figure 3.8 (a) One-Dimensional Gaussian function in time and frequency domain, (b) real and 
imaginary parts of Gaussian function 


In [101], Daugman extended the 1D Gabor function to a 2D Gabor filter. A 2D Gabor 
function is defined as a complex sinusoidal plane wave modulated by an elliptic Gaussian 
probability distribution function. Mathematically, the 2D Gabor function is represented as 
follows: 

w(x, y) = oars By? Jel fox 


x'=xcos@+ ysin@ 


(3213) 


y'=—xsin 0+ ycos8, 


where f, is the frequency of the sinusoidal plane wave, @ is the anti-clockwise rotation 


of the Gaussian and the plane wave, @ is the sharpness of the Gaussian along the axis 


parallel to the wave, and £ is the sharpness along the axis perpendicular to the wave. 
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(b) 


Figure 3.9 2-D Gabor filter (a) Real part; (b) Imaginary part 


A set of Gabor filters can form a filterbank with each filter being characterized by 
parameters to specify the filter orientation, spatial frequency, and filter width. The 
filterbank is composed of filters at several orientations and frequencies. It has equal 


orientation and octave-frequency spacings: 


Be aera err fil) (3.14) 


(3.15) 


where Ss (>1) is the ratio between two consecutive frequencies, 


. h : 6 
6, isthe k” orientation, 
N, is the number of orientations, 


f, isthe 1” frequency, 


S max iS the maximal frequency. 
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Figure 3.10 Gabor Filters. (a) Filter parameters in frequency domain; (b) bank of filters in frequency 
domain [102]. 

3.3.2 Gabor transform in medical image analysis 

The Gabor transform has proven to be successful in performing various image analysis 
tasks. In comparison to wavelets, the Gabor transform offers better localization and 
orientation selectivity. In the field of medical imaging, Gabor filterbanks have been used 
for feature detection and representation tasks [103]. Various Gabor-based systems have 
been suggested for applications such as the enhancement, segmentation, and registration 
of medical images [102]. Figure 3.11 shows an example for the decomposition of a 


lumbar spine x-ray image by using a Gabor filter-bank. 


Despite being able to encode rich information about local image structures, Gabor filters 
face a number of challenges. These challenges include the following: 

- The selection of the filter parameters is a difficult task. A trade-off exists between 
the number of filters and the continuity of the feature space. In many imaging 
applications, the continuity of the feature space is a desirable property. 

- In comparison to steerable pyramids [104], a Gabor filterbank does not have 
complete or approximate shift invariance. 

- In comparison to Contourlets [99], Gabor transform allows only for a fixed 


number of orientations per scale. 
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Figure 3.11 Analysis of lumbar spine x-ray by using Gabor Filterbank (six orientations at a fixed 
scale) 


3.4 Steerable Pyramid 
3.4.1 Definition 


The steerable pyramid is a two-dimensional transform that is jointly-shiftable in 
orientation and position. Shiftability refers to the fact that the response at any parameter 
value (position or orientation) can be constructed as a linear combination from discrete 
values of the same parameter space [104]. Shiftability in position does not imply 
complete shift invariance. However, a shiftable transform exhibits “approximate” shift 
invariance where the power of its coefficients is invariant to translations of the input 


signal. 


The basis functions of a steerable pyramid are the translations, dilations, and rotations of 
a single kernel. The transform is constructed as a recursive pyramid. Figure 3.12 shows a 
first-level image decomposition using a steerable pyramid. An input image is 


decomposed into low-pass and high-pass frequency bands. The low-pass frequency band 


is further decomposed into a number of oriented band-pass sub-bands_ B,, B,,....... Bee 


and a lower low-pass sub-band L, . The low-pass sub-band is then subsampled by a factor 


of two, and the set of oriented band-pass filters are recursively applied. 
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Figure 3.12 First level of Steerable pyramid decomposition 


The first low-pass and high-pass filters are defined as 


Iy(1,0) = L(5,0)12 (3.16) 
Hy(1,0)=H(C.8). (3.17) 


where r,@ are polar coordinates. L( )and H(_) are raised cosine low-pass and high- 


pass filters: 


2. poe 
4 


Lr) coe ey ee (3.18) 
2 A 4 2 
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The band-pass filter is a polar separable filter that is defined as follows: 


B,(r,9) = H(r)G, (@) (3.19) 
cos(lo (aay a 
Nee 2 
3.20 
H()= epee a: 
iH 
0, “ea 
4 


paca = (3.21) 


ak 
a (coscte= —)\* 3: 
x Lcos( a rae 


G,(@) = 


0, otherwise 


CKe-1): 


where @, = 2* | ————=——.. 
J K[2(K -1)]! 


Figure 3.13 shows the spectral decomposition performed by the steerable pyramid. The 
basis functions are all dilations and rotations of a mother function except for the inner 


low-pass sub-band and the outer residual sub-band. 


Figure 3.13 Illustration of the spectral decomposition performed by a Steerable pyramid. 


3.4.2 Steerable pyramids in medical image analysis 
- Steerable pyramids overcome many of the limitations in the wavelet transform. 
Steerable pyramids are jointly shiftable in orientation and position. This feature allows 


for better orientation selectivity over multiple scales, as shown in Figure 3.14. 
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Figure 3.14 Analysis of lumbar spine x-ray by using Steerable pyramid 


- In comparison to the Gabor transform, steerable pyramids offer better orientation 
analysis. Steerable pyramids allow for more flexible decomposition of the orientation 


space due to the steerability of the filters. 
F 4K . ; 
- Steerable pyramids are overcomplete. There are ee times as many coefficients in the 


representation as in the original image. This feature is desirable in many image analysis 
and feature extraction tasks. 
- In comparison to Contourlets [99], steerable pyramids can offer only a fixed angular 


resolution over all scales. 


3.5 Contourlet Transform 


3.5.1 Definition 

In [99], Do et al. introduced the Contourlet transform as an efficient directional multi- 
scale tool for image representation. The transform was initially defined in the discrete 
domain. An efficient implementation for the transform was proposed by Do. et al. in [99]. 
The transform is implemented by using a pyramidal directional filter bank (PDFB). An 
image is decomposed into directional sub-bands at multiple scales. This decomposition is 
implemented by using a cascaded Laplacian pyramid, with a directional filter-bank 


applied at each scale. This decomposition is illustrated in Figure 3.15. 
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Figure 3.15 Contourlet decomposition by using ideal Pyramidal Directional Filberbank. The horizontal 
and vertical axes refer to the horizontal and vertical frequencies of an image. The coloured region 
shows the frequency components present in the image. The white region refers to the frequency band 
filtered out. 


The Laplacian pyramid captures the point singularities, and the subsequent directional 


filtering captures the linear singularities. Multi-scale decomposition for the input image 
(, ) is performed by using the Laplacian pyramid. The outcome is a set of J band-pass 


images and a low-pass image, as in equation (3.22): 


bec ieee 3.22 
EE : ee 
1 


where LP, is an operator for the Laplacian pyramid with J levels, 
> is the input image, 


{b ; } is the set of band-pass images arranged in a fine-to-coarse order, 


: . th : 
a, is a low-pass coarse image at the J~ resolution level. 


Next, a multi-directional decomposition is achieved by applying an iterated directional 


filterbank (DFB) for each band-pass image ( b, ) [99,105]. Do et al. constructed a 2-D 
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directional filterbank (DFB) that can be maximally decimated while preserving perfect 


reconstruction. This task is efficiently achieved through an I -level binary tree 
decomposition resulting in 2’ sub-bands with wedge-shaped frequency partitioning. The 
/ -level tree-structured DFB can be equivalently viewed as a 2’ parallel channel 


filterbank with equivalent filters. At the j” resolution level, the band-pass image (b, ) is 


further decomposed by an / j -level directional filterbank (DFB). The outcome is a set of 


l. 
5 J 
response images { C ‘rate 


DFB, Gai ean Oe? 1 | (3.23) 


In the frequency domain, the Laplacian pyramid (LP) creates a sequence of nested 


approximation subspaces, ... V, CV, C Vi, suey Where V, is defined on a uniform grid 


with intervals Qe This pyramid also creates the detail 
subspaces,... W, C W, a W, ..., which represent the difference images in the Laplacian 


pyramid that hold the details necessary to increase the resolution between any two 


consecutive approximation subspaces. The DFB is then applied onto the detail subspace 


W,. Figure 3.16 depicts the division of the frequency space by using the Contourlet 
transform. V, is the approximation subspace. W, is the detail subspace. The index k 


J 


runs from 0 to 2’ —1 directions [99]. 
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Figure 3.16 Division of frequency spectrum by using Contourlet transform.(a) Subspaces generated by 
LP, (b) Subspaces generated by DFB, (c) Complete division generated by PDFB . 


3.5.2 Contourlets in medical image analysis 

The Contourlet transform has a number of interesting properties that make it an appealing 
alternative for medical image analysis. The Contourlet transform provides an efficient 
sparse representation for two-dimensional signals. It possesses a number of desired 
features such as multi-resolution representation and a high degree of directionality, time- 
frequency localization, near-critical sampling, and anisotropy. These features allow 
Contourlets to deal effectively with smooth contours in medical images. The Contourlet 
transform is capable of performing flexible directional decomposition per scale, while 
keeping a near-critical sampling. The Contourlet transform is also computationally 


efficient as it is constructed by using iterated filterbanks. 


Owing to these properties, the Contourlet transform has the potential to outperform the 


wavelet transform in various image analysis tasks. Nevertheless, it still suffers from a 
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lack of shift invariance due to the down-sampling step. Shift invariance is a desirable 


feature in tasks such as feature extraction and edge detection. 


Figure 3.17 Example of basis functions of contourlet transform 


3.6 Non Subsampled Contourlet transform (NSCT) 


3.6.1 Definition 

In [106], Do. et al. introduced the Non Subsampled Contourlet Transform (NSCT) as the 
shift invariant version of the Contourlet transform. The NSCT is a shift invariant multi- 
scale multi-directional image representation tool. The NSCT is constructed by using non 
subsampled pyramids and non subsampled Directional Filter-Banks (NS-DFB). The 
multi-scale decomposition is achieved by using the non subsampled pyramid. The non- 
subsampled Directional Filter Bank (NS-DFB) is responsible for the multi-directional 


analysis per scale. 


Figure 3.18 shows a block diagram for the Non Subsampled Contourlet Transform 
(NSCT). The non subsampled pyramid filters the image into low-pass and high-pass 
bands. The high-pass band is further decomposed by using the non subsampled 
Directional Filterbank (NS-DFB) to obtain several directional sub-bands. The process is 
iterated onto the low-pass sub-band. Figure 3.18(b) shows the resulting partitioning of the 


frequency domain. 
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Figure 3.18 Non Sub-sampled contourlet transform. (a) NSCT filterbank. (b) Idealized frequency 
partitioning, (c) NSCT basis functions. 


Non Subsampled Pyramid 
A non-subsampled pyramid is used to achieve multi-scale decomposition. Shift 


invariance is preserved due to the non sub-sampled decomposition step. The building 


block of the non subsampled pyramid is a two-channel non subsampled 2D filterbank that 


satisfies: 
(3.24) 


H,(2)G, (2) + H,(2)G,(z) =1, 
where H(z) and H,(z) are the low-pass and high-pass analysis filters. G,(Z) ; 


G, (Z) are the low-pass and high-pass synthesis filters. The pyramid is constructed by 


iterating the non subsampled filterbank. 
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Figure 3.19 Non subsampled pyramid decomposition. Frequency response of the building block of the 
non subsampled pyramid. 


Non subsampled Directional Filter Bank (NS-DFB) 

A Non subsampled Directional Filter-bank (NS-DFB) is used to achieve multi-directional 
analysis at every scale. The building block of the NS-DFB is a two-channel non 
subsampled 2D filterbank. Figure 3.20 shows the frequency response of the building 
block of the non subsampled Directional Filterbank (NS-DFB). Finer directional 


decomposition is achieved by iterating the process. 


Ti) 


Figure 3.20 Frequency response of the building block of the Non Subsampled Directional Fiterbank. 


3.6.2 Non Subsampled Contourlet Transform in medical image analysis 

Medical images are often examined to extract features and geometric information about 
underlying anatomical structures. The Non Subsampled Contourlet Transform (NSCT) 
exhibits a number of significant properties that make it suitable for the analysis and 


interpretation of medical images. 
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The NSCT shares a number of interesting properties with Wavelets, such as the 
multi-scale and locality of the basis functions. In addition to these properties, the 
NSCT exhibits non-isotropic basis functions which allow for better orientation 
selectivity over scales. Thus, the NSCT can better capture higher order 


discontinuities in an image. 


In comparison to Gabor filters and steerable pyramids, the NSCT allows for 
variable angular resolutions over the scales. Thus, more powerful orientation 
analysis can be achieved by using the NSCT. The NSCT is also more 


computationally appealing than Gabor filters. 


In comparison to the Contourlet transform, the NSCT is shift invariant. This 
feature means that a pixel in the original image domain will maintain its spatial 
position in all sub-bands of the transform. Thus, we can extract more 
discriminatory and rich information about the structure under investigation. This 
benefit implies that the NSCT would be more suitable for applications involving 


feature extraction and object-tracking tasks. 
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4. Maulti-scale Shape Model Using Best Basis 
Selection Framework and Independent 
Component Analysis 


4.1 Introduction 


In biomedical applications, statistical shape models try to faithfully represent the full 
range of biological variations in anatomical structures. These models are often used in a 
wide range of applications such as segmentation, registration, and surgical planning. We 
are particularly interested in applications to support medical practices related to the 
imaging of the human spine. Medical experts often examine hundreds of spine 
radiographs to determine the existence of various vertebral pathologies. Common 
pathologies of interest include anterior osteophyte, disc space narrowing, and fractures. 
Figure 4.1 shows examples of x-ray images of the human lumbar spine. Figure 4.2 shows 
two vertebrae with localized osteophyte pathology. The accurate modeling of anatomical 
structures is a challenging task due to shape complexity and the wide range of possible 
local and global variations. In this chapter, we attempt to tackle some of the existing 
challenges in shape modelling. In particular, we focus on the following goals: 

- Finding a proper shape representation that accurately characterizes the 

anatomical structure of interest. 
- Constructing an efficient statistical model that captures both local and global 


pathology-related deformations. 


Figure 4.1 Examples of x-ray images of the human lumbar spine [108] 
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Figure 4.2 Two vertebrae with anterior osteophytes. (a) Lumbar vertebra with upper left-hand corner 
osteophyte, (b) lumbar vertebra with upper and lower left-hand corner osteophyte. 


Within this context, this chapter presents a novel multi-scale statistical shape model using 
the concepts of sparsity, dimension reduction, and statistical independence to extract 
multi-scale modes of deformations. Within a best basis selection framework, the 
proposed model benefits from the multi-scale and sparsifying nature of wavelet packets 
and the ability of Independent Component Analysis (ICA) to capture higher-order 
statistics. This advantage allows us to construct a localized, non-uniform shape model 
with clinically relevant modes of deformations. The rest of the chapter is organized as 
follows: section 4.2 discusses some of the limitations of Principal Component Analysis 
(PCA) as a shape representation and modeling tool. Section 4.3 provides a description of 
the dataset used in our experiments. Section 4.4 describes shape modeling as a basis 
selection problem. Details of the proposed multi-scale shape model are presented in 
section 4.5. Experimental results are given in section 4.6. Discussions and conclusions 


are presented in sections 4.7 and 4.8, respectively. 


4.2 Limitations of PCA-based statistical modeling 

Principal Component Analysis (PCA) is a classical technique that is useful mainly for 
data reduction. Given a set of measurements, PCA tries to reduce the redundancy in the 
representation through decorrelation. In the original formulation of the Active Shape 
Model [35, 36], Cootes et al. described a Point Distribution Model (PDM) that captures 
the shape variations over a training set by using Principal Component Analysis (PCA). 
Details of the Active Shape Model were presented in section 2.4. Within the context of 
statistical modeling, PCA has the advantage of producing a compact representation. 


However, recent studies have indicated that PCA-based models suffer from a number of 
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inherent limitations that restrict the overall model performance [40, 49, 107], particularly 


with complex anatomical structures. These limitations include the following: 


1n 


A PCA-based shape model assumes that the data are normally distributed, but 
this assumption is not necessarily true. 

A PCA-based model represents the shapes as a linear combination of the 
eigenvectors plus an average shape. Using the eigenvectors as the basis functions 
limits the ability of the model to capture local variations. A change in the weight 
of one vector affects the entire shape. Thus, a PCA-based model tends to capture 
global variations more often than local variations. Figure 4.3 shows the first 
mode of variation for the Point Distribution Model (PDM) of the lumbar spine. 


Clearly, changing the weight of the mode globally affects the overall shape. 


Figure 4.3 First mode of variation of a PCA-based model of lumbar spine 


When the size of the training set is much smaller than the dimension of the shape 
space, then the sample covariance matrix estimated by using PCA becomes a 
poor estimator of the true covariance matrix. This limitation increases the 


model’s sensitivity to noise and outliers. 


The degree of freedom of a PCA-based model is limited by the number of 
samples in the training set. For a training set of K-1 shapes, each with M degrees 
of freedom and M>>K-1, a PCA-based model will capture only up to (K-1) 
modes of variations. Thus, with few training samples available, the model is not 


able to generalize well to valid shapes that are not present in the training set. 
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Figure 4.4 shows examples where a PCA-based model of the lumbar spine fails 


to capture localized deformations. 


Figure 4.4 Examples for failure of PCA-based model to capture localized vertebrae deformations. The 
solid lines are the actual shapes, and the dotted lines are the predicted shapes. 


In section 2.6.1, we discussed several approaches suggested in the literature to overcome 
the limitations of PCA-based shape models. The work presented in this chapter builds 


upon and extends these research efforts. 


4.3 Data Description 

The proposed multi-scale shape model presented in this chapter is evaluated by using a 
dataset of 40 outlined contours of five vertebrae from x-ray images. The spine x-ray 
dataset was provided by the National Library of Medicine [108]. For each x-ray image, 
the lumbar vertebrae L1-L5 are annotated and labelled as normal or abnormal (i.e., with 
osteophyte pathology) by experienced radiologists. Figure 4.5 shows how a dataset of 
lumbar spine shapes is built. By fitting a B-spline through the selected boundary in each 


image, we obtain a smooth outlined contour of the lumbar spine. 
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Figure 4.5 Building training dataset for shapes of lumbar spine. (a) Sample x-ray image of lumbar 
spine, (b) Boundary points selected by an expert, (c) Interpolated shape of lumbar spine 


4.4 Shape modeling as a basis selection problem 

It is now widely accepted that the PCA basis is not well suited for modeling complex 
anatomical shapes, which normally exhibit localized deformations [49, 52, 54, and 111]. 
Finding a suitable basis set for a particular class of signals is an important task in a wide 
variety of fields. In the context of modeling anatomical structures, desirable properties in 
the modeling basis include the locality, interpretability, and compactness of the 
representation. At the core of the model presented in this chapter are the concepts of 


multi-scale representation, sparsity, and statistical independence. 


4.4.1 Multi-scale analysis 

The concept of multi-scale analysis is supported by our biological vision functions. Over 
the past years, wavelets have been successively used as the predominant multi-scale 
representation tool. The wavelet transform offers an elegant way of scale-frequency 
decomposition by using localized basis functions. More details about wavelets were 


presented in chapter three. Multi-scale analysis involves decomposing the function into a 
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sequence of subspaces V, ...cV_, CV, CV,...c L’(R) and then studying the signal 


response in each of these subspaces. The proposed shape model employs the Wavelet 


Packets basis to obtain a multi-scale representation of the shapes in the training set. 


4.4.2 Sparsity 
Recent studies have emphasized the role of signal sparsity in facilitating data 
interpretation. Sparse coding has proven to be useful in a diverse range of problems 


including biomedical inverse problems, spectral estimation, and data compression [109, 


110]. 


The sparsity of a representation means that only few coefficients are significantly active, 
while most of the other components are close to zero. Most signals can be projected onto 
a sparse space by using a suitable sparsifying basis set, which can be chosen as a fixed set 
of functions (e.g., the Discrete Cosine Transform and the Wavelet transform). Another 
alternative is to search for a best-basis set in an overcomplete dictionary of basis 


functions (e.g., wavelet packets). 


For a vector x€ R” , the true sparsity measure is the quasi-norm L, which is defined as 
lx, #{ie [an]: x, #0}. (4.1) 


However, due to the instability of the true sparsity measure, it is usually replaced by the 


L? norm, which is defined as 


(4.2) 


= 
al P 
l=, {Sir ? O<p<l. 


4.4.3 Statistical independence 
The statistical independence constraint on a basis set means that the basis functions do 


not interfere with each other. This fact can be explained in statistical terms as follows: 
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Let p,(x,) and p,(x,) denote the probability densities of two basis vectors x, and x,, 


respectively. Let p(x) denotes the joint probability. x, and x, are said to be 
statistically independent if and only if the joint distribution function can be expressed as: 
(4.3) 
P(X) = P,(%) P2(%)- 


The statistical independence of x, and x, implies that they are uncorrelated: 


COV (%,,%)) = E{(%,-M,  —M,, )} = ECG )E()- HW, =0. (4-4) 
While covariance captures the second-order dependences between x, and x,, statistical 


independence tries to capture higher-order statistics as well. 


4.5 Methodology 


This section outlines the steps for constructing the proposed multi-scale shape model by 
using best-based selection and Independent Component Analysis. Throughout this 


section, the proposed model is referred to as the Wavelet-ICA shape model. 


We start with N-training samples in the form of K corresponding landmarks. The set of 


training samples are first aligned by using the Procrustes alignment method. Let the n" 


shape vector be represented as 


D Gabi aera 0 eh epee My ape: || (4.5) 


where x and y are the x-and y -coordinates. 
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Figure 4.6 Shape alignment using Procrustes method (a) unaligned shapes of lumbar spine, (b) aligned 
shapes of lumbar spine. 


Figure 4.7 shows the block diagram of the proposed Wavelet-ICA shape model. Given 
the training set of aligned shapes, a Multi-scale sparse representation is obtained by 
projecting the shapes onto a joint best-basis set selected from an overcomplete dictionary 
of wavelet packets. An entropy-based cost function is used for the best-basis selection. 
Next, we benefit from the approximating and sparsifying nature of wavelet packets to 
achieve efficient feature selection. The outcome is a reduced set of wavelet packet 
coefficients with high statistical significance. A spectral clustering step is next applied 
to extract bands of correlated coefficients. The statistical properties of the wavelet packet 
coefficients in each sub-band are then modelled by using Independent Component 


Analysis (ICA). 


65 


' 

; 

\ 

| 

| 

i 

1 y 

i 

| 

; 

' 

; 

; 

} 

| 7 

| 

} f 

Ci) 

{ i 

{ 1 RENN — se ls eg el Latin emilietecmmmncoger cS eo et 
t i , 5 ty is, 

dy nl % ) ) 


boing thin Ct) 2nd coli Vy eagenahe s2wongilaind isi) Daa ona 
sly} 

navits isi ee oq neaeAY nse U be ayy 3a 

ed beady 4i-1 jot i922) Ay "se ig intl 


Wwanonoth vt ae HOSS Ry ura) bata oiglpplgod ii ie 
MON Aalen eyed 190 off ‘oh ba we Habtaaih: e079 bind ee wh 


SoAcomey bonnes ti a ie * He pantianio ant giiesine pe 
baliqgs3kon 2b agate gamaplts tense A sonseMingle Incoeinde tg 
HAvag falore'y sit! to sinaqeiieg’ Tea Heine att vettisign tubs isla a 
“otagemne” inshrieqabn! ai ie lisflatioun nit Sik bheddee shots 1 4 


° ‘ 3 ! 
bate le 
{ 
‘ = i 
i f ne 


Shapes in training set : 


| Subtract mean shape 


Zeros-mean landmark-based representation : 
T 
X, =[4@ +N). 9,0) «7 | 
| Project onto joint best basis 
Sparse shape representation - 


CLG Ne een 


| Subspace selection 


Dimensionally-reduced sparse representation : 


C.=[CM CON. @ ..C AN] .M<N 
| Spectral Clustering 


Bands of correlated coefficients 


| IC A-based modeling 


Shape prior : 


gh ~ sy pg” p® 


Figure 4.7 Block diagram of the proposed multi-scale Wavelet-ICA shape model 
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4.5.1 Multi-scale shape representation 

Within the context of medical imaging, statistical shape models are required to have good 
statistical performance and relevant clinical interpretability. The modeling of anatomical 
structures is closely tied to the problem of efficient representation. Thus, prior to 
statistical modeling, we first find an optimal multi-scale representation of the anatomical 
structures in the training set. This representation is then used to construct a shape model 


with multi-scale and interpretable patterns of variations. 


4.5.1.1 Wavelet Packets 
Over the past years, wavelets have been successively used as the predominant multi-scale 
representation tool [97]. Wavelet packets represent a family of orthonormal bases in L’(R). 


The wavelet packet is defined as [112] 


y(t) = DAY" (2t-k) (4.7) 
k 
2n+1 n (4.6) 
yr" a)=>) gy" (2t—-k) > 
k 
where y°(t) is the scaling function, and y(t) is the wavelet function. (i ¥21 2,4 are the 


low-pass and high-pass filters associated with the wavelet transform. A wavelet packet 


basis is any orthonormal basis selected from the set of functions in equation 4.8: 
VG a Zo Wt =f) (4.8) 


where k is the scale parameter, K is the maximum scale, i= 0..., 2* —1 is the frequency 
index at the k” scale. j= 0..., 2Xo-* _1 is the translation index at the k” scale, and 2" is 
the length of the input signal. 

The wavelet packet library can be organized in a binary tree structure, with each node 
representing the projection of the signal into a certain sub-space, as shown in Figure 4.8. A 


p -level wavelet packet tree divides the frequency domain into 2” bands or sub-spaces. 
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Figure 4.8 The tree of wavelet packet decomposition. Each doublet (k,i) represents the depth(k), and 
the number (i) of the nodes. 


eet {Vii} be the collection of all the wavelet packets of the translations at node (k,i). 
The collection of coefficients corresponding to the disjoint cover of the frequency domain 


will then form an orthonormal basis. Thus, for k levels of decomposition, up to be 


bases are present. 


This flexibility offered by wavelet packets is exploited to construct a multi-scale localized 
statistical model. Figure 4.9 shows an example of vertebral shape decomposition using 
wavelet packets. Due to the local support of the basis functions, they tend to capture 
different characteristics of the lumbar spine, ranging from the global aspects of the shapes 


to more localized deformations. 


68 


' 

AME: tin Te 

' Hades 

' 

Iasmensrety $num ir een tetera URN se 


tie adigals it oes a fact TOG on day 
ae GH) peels 


ASE } AP CtT is aroun Anst Of 1 india shh ho ail wate 
siBEOE veo EsTt of) toa sy¥O0 iohouyt ai ay 26 aitnmpamebiaht ct 
“S ob qe ioiieahmtoseh 1 devat ry jor “i viPed Hononestno mf 


5 
nt 


i. ow , + 


tiegileso! pone a pulang GY Lalo an eb bee oti 
y(t ned yiwoRititD 3B oer bias oto again ne ae a'S ) 
slg Ot Iyer) coe auoroill vided dayton ee, De r 
Meperie coh te rool Ip sold: adi a ae Seastye Ati 


i 


Figure 4.9 Example of shape decomposition using wavelet packets. (a) Mean shape, (b) Mean shape 
plus variations in low frequency bands, ( c ) Mean shape plus variations from all frequency bands. 
(The dotted shape is the original shape). 


4.5.1.2 Optimal representation using best basis selection 
The multi-scale representation is achieved by searching for the joint best-basis in an 


overcomplete wavelet packet dictionary. Let the shape vector be represented as 
2 TE 
PGs (PY Roe hs ek ieeame id (4.9) 


where x and y are x-and y-coordinates. 


We use wavelet packets to find an optimal representation for the shapes in the training 


set. For a p-level wavelet packet tree, there are vis possible orthonormal bases. Each 


basis corresponds to a specific decomposition for the frequency space. Given a library of 
wavelet packet basis, we search for the basis that best represents the shapes in the training 


set. 


Let D denotes a wavelet packet dictionary with all 2”" possible orthonormal 


bases {Q, }. The projection of the shape vector X onto any basis @ is then given by 


CEO Xe (4.10) 
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Let M(g’ X) denote the cost function for projecting the shape X onto the basis gy. The 


best basis in the dictionary D is then the one for which the cost function, M(g’ X), is 
minimized: 


Pro, =argmin, ,(M(g' X)) . (4.11) 


Coifman and Wickehauser [115] proposed a fast bottom-up search algorithm that finds 
the best orthonormal basis from a wavelet packet library according to an additive 
information cost function. In our model, we use the Shannon entropy as the cost function 


for selecting the best basis. The Shannon entropy for a wavelet packet node (k,i) is 


defined as 
; (4.12) 
H(k,ij=—) p,log(p,) , 


‘ 2 
Where p = Ie) 1 , C,, are the wavelet packet coefficients at the node (k,i), X is 
j : 
|x| 
the observed shape signal, k is the scale parameter of the wavelet packet tree, and i is 
the frequency index at the k” scale. j is the translation index at the k” scale and i” 


frequency index. 
Using the best-basis selection framework, we obtain a multi-scale sparse representation 


for the anatomical shape of interest. In statistical terms, this result means that the 


distribution of the coefficients is highly peaked, as illustrated in Figure 4.10 
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Figure 4.10 Probability distribution of the multi-scale shape representation of the shapes in the 
training set (Kurtosis = 90.1136) 


Next, we use the training set of shapes to obtain a jointly-optimal multi-scale 
representation. We evaluate the significance of each basis function in representing the 
shapes in the training set. This process includes a feature-selection step in the wavelet- 
packet domain. The parsimonious representation of the shapes in the wavelet packet 
domain suggests that only a few coefficients are needed to represent a given shape. This 
sparsity also implies that the wavelet packet basis functions are not equally important for 
the statistical modeling task. Feature selection is achieved by discarding the basis 
functions that do not seem to contribute significantly to the overall energy of the 
population represented by the training set. The basis function whose coefficients tend to 
be constant is also less significant than the one whose coefficients exhibit more variations 


over the training set. 


Let X,=[x,(1) ..x,(2)....4(N), y,0) .y,(2)-¥(N)]} be a zero-mean 


landmark-based representation of the i shape signal, with N vertices. x, and y, 


represent the x- and _ y - coordinates respectively. 
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representation of the shape signals x, and y, in the wavelet packet domain 
andi =1....K , where K is the number of shapes in the training set. The wavelet packet 
representation C;, is obtained by projection of the shape signal X, onto the best wavelet 


packet basis Q,,.: 


OSCR (4.13) 


Let E be the energy of the i” shape in the training set represented in the multi-scale 


wavelet packet domain: 


Bae any C7 (1): | (4.14) 


The total energy of the shapes in the training set can then be defined as follows: 


Eat = UE, = LAL IG (ny +C7 "13. (4.15) 


n=) 


Using equations 4.14 and 4.15, we define a measure for the contribution of each basis 


function to the overall energy. The contribution of the n"" basis function is defined as 


S(n) = s [Cr my +70)? |} Eo (4.16) 


Next, we rank the basis functions according to the contribution measure S(7) . We retain 


a sub-set of the basis functions that contribute to a percentage of 99.98 % of the total 
energy of the shapes in the training set. This feature selection step improves the 
statistical significance of the model. Table 4.1 shows the percentage of the energy 
captured for different target compression percentages when the size of the training set is 
fixed at 25 shapes. A compression percentage of 60 % is achieved with only 0.0526 % 


loss of the total energy. The mean square reconstruction error is kept below 0.5 pixels. 
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Table 4.1 Total energy captured for a number of target compression percentages, with the 


size of training set kept fixed to 25. 


35 40 45 50 55 
% of energy | 99.992% | 99.9882% | 99.9827% | 99.975% | 99.9640% | 99.9474% 


Figure 4.11 shows an example of a shape reconstructed by using the optimal wavelet- 


Target comp. 


packet representation. This figure reveals that the sparsifying nature of the wavelet-packet 
basis leads to a useful dimension-reduction property without introducing significant 


reconstruction errors. 


Within the context of anatomical shape modeling, the proposed multi-scale shape 
representation has the following merits: 
- First, constructing a statistical model based upon the proposed representation will 
enhance the model generalization, with a limited number of training samples. 

- Second, the feature selection step will also have a positive impact on the 


statistical significance and interpretability of the model. 
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Figure 4.11 An example for shape reconstruction using the proposed multi-scale representation. 
(a) original shape, (b) reconstructed shape from the optimal multi-scale space (50% reduction). 


4.5.2 Constructing the shape model 

Using the multi-scale shape representation described in section 4.5.1, we next attempt to 
reveal the hidden independent sources of variations within the training set. First, we 
group the correlated wavelet packet coefficients into separate bands by using the spectral 
clustering technique [113]. The coefficients in the same band represent the areas of the 
shape that have correlated variations in the population. Next, we use Independent 


Component Analysis (ICA) to capture the localized variations in each band. 


4.5.2.1 Spectral clustering 

Prior to modelling the wavelet packet coefficients, we group the coefficients into a 
number of bands. The correlated coefficients are grouped together, while the coefficients 
across the bands tend to have minimum cross-correlation. This procedure is done by 


using the spectral clustering technique in the wavelet-packet domain [113]. 


The spectral clustering method tries to group highly correlated coefficients. Clustering is 


viewed as a graph-partitioning problem. The elements to be clustered are viewed as the 


nodes of a graph. Any two nodes {n,;,1,} are connected together by an undirected edge 
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with a weight w, . The weight on the edge reflects the degree of dissimilarity between the 


two nodes. Clustering is then achieved by finding the optimal partitioning for the 
similarity graph by using the Maximal Cut algorithm [113], which divides the graph into 
a number of sub-graphs, such that the nodes within a sub-graph have the highest 
covariance, and the nodes across the sub-graphs have the lowest covariance ( see Figure 


4.12). 


Figure 4.12 Illustration of the spectral clustering algorithm - undirected graph representation with 
nodes (n1, n2, n3, n4), and weighted edges 
Given the training set of the shapes in the optimal wavelet-packet domain, we use the 
spectral clustering technique to cluster correlated wavelet packet coefficients. A fully 
connected undirected graph is first constructed with the wavelet packet coefficients as the 


nodes and the weights of the edges (w,) defined as the covariance among the 


coefficients in node (i, j) . By recursive graph partitioning of the coefficients, the highly 
correlated coefficients are grouped together into separate bands, with a minimum cross- 


covariance for the coefficients across the bands. The outcome is a total of B bands. 


4.5.2.2 Statistical modeling using Independent Component Analysis (ICA) 

After we have discovered the bands of the correlated coefficients in the wavelet-packet 
domain, the next step is to uncover hidden modes of variations in each band. This task is 
performed by modeling the coefficients in each band by using Independent Component 


Analysis (ICA). 
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Independent Component Analysis 

Independent Component Analysis (ICA) is a generative modeling technique in which an 
observed multi-dimensional signal can be approximated as a weighted sum of the 
components that are as statistically independent as possible. ICA is often used to separate a 
group of mixed signals into a number of independent sources (independent components) 
with no knowledge about the mixing or the un-mixing matrix. Since its introduction, ICA 
has proven useful in a wide range of applications such as image feature extraction, 
denoising, blind source separation, and Functional MRI data processing [116]. Recently, 


ICA has been introduced as a useful tool in modeling anatomical structures [49]. 


In mathematical terms, the ICA modeling problem can be formulated as follows: Let X 
be a matrix with a number of observation signals coming from mixtures of unknown 


source signals (Independent Components -ICs): 


K SAS vy (4.17) 


where A is a matrix containing the mixing parameters, and S$ is the matrix of the source 
signals or independent components. The goal is to find the original source signals from the 


mixture X by estimating the de-mixing matrix U : 


S=UX (4.18) 


Independent Component Analysis (ICA) seeks to extract statistical independent source 
signals by minimizing the statistical dependence among the components of the observed 
dataset (X ). Several cost functions can by used to quantify the statistical dependence 
among the components of X . Two popular cost functions are the kurtosis and the mutual 


information [116]: 


Kurtosis (X ) =E(X*)—3 (E(X’))’ (4.19) 


Ee att x) 1 OX): (4.20) 


i=] 


ICA-based modeling 
In the context of statistical shape modeling, ICA tends to capture more localized shape 


variations than PCA, which is restricted to second-order statistics. The proposed multi- 
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scale statistical shape model uses ICA to model the wavelet packet coefficients in each 


sub-band. 
Let C) denote a vector of the wavelet-packet coefficients in the sub-band b, for shape7. 


For every sub-band, we calculate the mean coefficient vector over the training set: 


Gt y C! (4.21) 
where e is the average coefficient vector in sub-band /. 


Next, we consider the zero-mean coefficients in each sub-band b, as a mixture of the 


independent source signals with an unknown mixing matrix A. Any ICA algorithm can 
then be used to estimate the independent source signals by using the de-mixing matrix. 


We employ the FastICA algorithm to estimate the independent sources [117]. 


Leet be the matrix of wavelet packet coefficients in sub-band b, for all shapes in the 
training set: 
l l l 
C =|¢ oe Galt (4.22) 


Then the Independent Component Analysis (ICA) problem can be formulated as follows: 


IC On 
dC: = Aus: (4.23) 
S'=W dG 


where S’ is the matrix with the independent components (ICs) in sub-band b,, A is the 


AN 


mixing matrix, W is the de-mixing matrix, and S ’ is the estimated matrix of ICs in sub- 


band b,. 


Figure 4.13 summarizes the procedure for constructing the multi-scale Wavelet-ICA basis 


of the proposed model. The complete multi-scale statistical shape model then includes the 
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joint best-basis matrix (B,,., ), the membership of each coefficient in a specific band, the 


est 


average of the wavelet packet coefficients in each sub-band ( C“ ), and the estimated 


() 


independent source matrix for each sub-band (S“’). The proposed multi-scale shape 


A 


model can then be written as in equation 4.24, with $ as the ICA basis in 


sub-band /: 


A A 


C° See 1B. (4.24) 


Shape 1 WICA basis 1 


Wavelet best basis {B} 


a 


e ana mn naar cms od 


sas 


Wavelet Packet best-basis representation Multiscale Wavelet-ICA basis 
(WICA) 


N input contour shapes 


Figure 4.13 Construction of multi-scale independent Wavelet-ICA basis (WICA) 


4.6 Experiments 


In this section, we evaluate the performance of the proposed multi-scale shape model. We 
use a dataset of 40 outlined contours of five vertebrae from x-ray images of the lumbar 
spine [108]. Figure 4.14 shows an example of x-ray images with outlined vertebrae 
contours. First, the proposed statistical model is constructed by using a training set of 
shapes. Next, we evaluate the ability of the model to reconstruct unseen examples. 
Throughout this section, the new multi-scale shape model is abbreviated as BB-ICA. 
Experiments were conducted to achieve the following goals: 
- Illustrate the ability of the new statistical model (BB-ICA) to capture local and 
global modes of deformations. 
- Compare the performance of the proposed model against 
i- The conventional PDM model (PCA-based) [35,36]. 
ii- Another related model from the literature that is based upon PCA 
analysis in the wavelet domain (BB-PCA) [80]. 


- Investigate the robustness of the new model to noise. 
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Figure 4.14 Example of x-ray images with outlined vertebrae contours [108] 


We use Daubechies wavelets with six vanishing moments to build a six-level wavelet 
packet tree. The optimal shape representation for the shapes in the training set is obtained 
by using the method described in section 4.5.1. The FastICA algorithm is used to perform 
the ICA-based analysis in the clustered wavelet packet sub-bands [117]. 


4.6.1 Modes of deformations 


This section evaluates the ability of the new model to capture local and global 


deformations. 


Statistical independence 


We use the mutual information as a measure of the statistical independence between the 


basis functions (or modes). The mutual information between two basis functions Y, and 


@, is calculated as follows: 


1(Q,,9,)=H(9,)+H(9,)-H(Q,,%). (4.25) 


where H(Q,)=—E | log(P,, (g,)) | 


Figure 4.15 compares the statistical independence for the PCA basis, ICA basis, and the 
basis of the proposed model (BB-ICA). The PCA basis clearly has higher-order 
dependencies. In comparison to the ICA basis, the proposed model reduces the average 


mutual information by almost 60 %. 
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Figure 4.15 Evaluation of the statistical independence of the basis functions for PCA-based model, 
ICA-based, and the proposed BB-ICA model. 
Multi-scale modes of deformations 
Figure 4.16 shows the different modes of variations obtained by using the proposed 
multi-scale model. This figure reveals that the model is capable of capturing both the 
local and global deformations. The first mode (C1) tends to capture the global variations 
in the lumbar spine. In the subsequent modes, we can observe more localized pathology- 


related deformations. 
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Figure 4.16 Multi-scale modes of the deformations of the proposed multiscale shape model. 


4.6.2 Shape estimation 


Next, we conduct experiments to evaluate the ability of the new model to reconstruct 
shapes not seen in the training set. We use the leave-one-out experiment to compare the 
performance of the new model (BB-ICA) against 

1- The conventional PCA shape model (PCA) [35]. 

ii- A recently proposed model based upon PCA analysis of the wavelet sub-bands 


(BB-PCA) [80]. 


The leave-one-out experiment is conducted for several times by using a set of T training 
samples, where T= [5, 10, 15, 20, 25, 30, 35, 40]. The model is constructed by using any 
T-1 training samples, and we measure the reconstruction error for the left-out sample by 


using the constructed model. The reconstruction error between the estimated shape 


81 


ony semi a ta i wn 


iv ie \ - ' p ( " 7 am rip a A a m sand 
| TP ee Le oS 
- ; f } : j ne / vs rh ; = a 


J atemnasw of ices won sa y iti ont. vleudars ot sansenivnyAs.f ; fei | 
ond oregenoy ot mnie 6 -9% weed wid oes JW 090 guintied oil vi rage 8 
“ening (491- al) Isbory won oat os 


Hons dive sslovane 00 | — na 


4 i tre 1 anitined i Ach ao Me 
p Leiba i 26 08 8.00 21 eT oe 


Rec, and the ground truth shape GT, is defined as the root mean square error 


between the vertices of the ground truth and the reconstructed shape. For a given number 
of training samples, the overall reconstruction error of the model is defined as the 
average reconstruction error. This procedure is repeated for the three models under 


investigation: PCA [35], BB-PCA [80], and the proposed model (BB-ICA). 


Figure 4.17 shows the plots for the average reconstruction error calculated for the 
different sizes of the training set. The curves in this figure reveal that the new model 
outperforms the other two approaches, even with few training samples available. The 
new model achieves an average error reduction of 30 % and 17 % over the PDM and the 
BB-PCA models, respectively. Thus, the proposed multiscale model appears to better 


generalize to the deformities unseen in the training set. 


Figure 4.18 shows the reconstruction of a ground truth test shape by using the PCA 
model and the proposed multi-scale model (BB-ICA). Both models were constructed by 
using 15 training samples. While the PCA model seems to be unable to capture the local 
deformations in the vertebrae bodies, our model appears to more closely follow the test 
shape. Figure 4.19 shows a zoom-in to the reconstruction of two deformed vertebrae by 
using both the new multi-scale model and the PCA-based model. Again, the proposed 


shape model tends to more closely follow the original contour. 
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Figure 4.17 Average reconstruction error for the following shape models: PCA-based, best-basis PCA 
(BB-PCA), and the proposed multi-scale best-basis ICA model (BB-ICA) 
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Figure 4.18 Reconstruction of a ground truth test shape with models constructed by using 15 training 
samples: (a) GT, (b) PCA-based model, (c) BB-ICA model 


83 


anivilesd &f guivinvd bola 


Ex 


{ om 

oS, Rae | 
A gta Lie, Ct 
by ae 


a bs Lc ttn coved 

L. 

i 

NES ave We Be a ee a : 
aah: cite 0 Ty 
"tend -) AIT shalicney ke npboustic? ay as 
ie oath ERNE irohuavins 7 PP a ath os rf 
ns ptr oe Ae 
ee ee 


sae eh) iA 


Te el eat eee ee] 


Ved 
4h 
> 
1 
wine ARP 


| j 


Vu q wipe ‘ 
4 he fe ; ' i 
| eee a ° : 
~ : eT an, re j 
F J. i Oe 4a 7 2 
a et Rt — er et ees . 
‘i el my : Th 


shh, srr nie apni sush ati baiuerig te noiiaeraadae 
cial tal eg 


eae mea) ty ai) — ) 


tabost a” 


a 
4 
‘ i a. 
re LSE i i 
if 
el t 
aA ee, 
Aye 
t 
& e vf 


| —— Ground truth 
—+ PCA-based 
~— BB-ICA based 


— Ground truth 
—— PCA-based 
-— BB-ICA based 


(b) 


Figure 4.19 Zoom-in to the reconstruction of two L2 lumbar vertebrae by using BB-ICA, and PCA- 
based models 
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4.6.3 Robustness to noise 


Finally, we evaluate the robustness of the proposed shape model to noise by adding 
Gaussian noise to the test shapes and calculating the average reconstruction error by 
using the leave-one-out experiment. Figures 4.21 and 4.22 show that the proposed model 


is minimally affected by noise. 


-— cemT 
——_ BE-ICA based 


Figure 4.21 Reconstruction from a noisy test shape by using 20 training samples. (a) Ground truth 
noisy shape, (b) PCA-based reconstruction, (c) BB-ICA based reconstruction. 
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Figure 4.22 Effect of noise on the reconstruction error of the proposed model. 
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4.7 Discussion 


In this chapter, we presented a novel multi-scale shape model for modeling the local and 
global variations in anatomical structures. Our model integrates the concepts of multi- 
scale decomposition, sparsity, feature selection, and statistical independence to extract 
sets of multi-scale modes of deformations. This section discusses the main aspects of the 


new model in comparison to related work in the literature. 


Compared to the conventional PCA-based Active Shape Model (ASM) [35], the proposed 


model has better generalization ability, particularly when few training samples are 
available. For a set of K training samples, our model can have up to K —1 degrees of 


freedom from each sub-band, in comparison to a total of K —1 degrees of freedom for 


the ASM model. 


We also compare the proposed model and the approaches proposed by Nain [47], 
Davatzikos et al. [45], and Mahmoud et al. [80]. Our model uses ICA for wavelet packet 
sub-band analysis, whereas all three models in [47, 45, and 80] rely on PCA as the 
statistical modeling tool. By using PCA to analyze wavelet sub-bands, we are restricting 
the model to second-order statistics. However, wavelet coefficients tend to be sparse, 
with non-Gaussian distribution, particularly for higher frequency bands. This feature 
encourages using Independent Component Analysis (ICA) for the analysis of the wavelet 
packet sub-bands. ICA assumes that the coefficients in each sub-band are mixture 
observation signals made up of independent components. Using the central limit theorem, 
we can assure that since the observations (wavelet packet coefficients) are non-Gaussian, 
the sources will be even more non-Gaussian. This result implies that ICA would be more 
successful than PCA in searching for hidden independent modes of variations in the 
wavelet packet sub-spaces. This argument agrees with the experimental results reported 


in section 4.6. 


In [47], Nain et al. introduced a multi-scale shape model using spherical wavelets and 
PCA. This model is limited to shapes with spherical topology, due to the use of spherical 
wavelet basis functions. On the other hand, our model constructs a multi-scale shape 
representation by searching for the best basis in an overcomplete dictionary of basis 
functions. This feature allows for more flexibility in modeling variations of the different 


shape topologies. 
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In [54], the authors presented a localized shape model using Sparse Principal Component 
Analysis (SPCA). In this model, locality is achieved by explicitly imposing sparsity 
constraint onto the loadings of a regular PCA. Doing so implies that the resulting shape 
model will still suffer from some of the inherent limitations of PCA-based models. On the 
other side, our model is capable of capturing both global and local variations by first 
projecting the data onto an optimal multi-scale sparse domain and then applying a higher- 


order statistical analysis technique. 


4.8 Conclusion 


This chapter tackled some of the challenges in the statistical modeling of anatomical 
structures. Throughout the chapter, we had two main goals: 
- First, finding the proper shape representation to accurately characterize 
anatomical structures. 
- Second, constructing an efficient statistical model to capture both the local and 


global pathology-related modes of variations. 


We presented a novel multi-scale statistical shape model that uses the concepts of 
sparsity, dimension reduction, and statistical independence to extract sets of localized 
clinically- relevant modes of deformations. Within a joint best-basis selection 
framework, the proposed model benefits from the multi-scale and sparsifying nature of 
wavelet packets and the ability of Independent Component Analysis (ICA) to capture 
higher-order statistics. The new method was applied to model variations in the human 
lumbar spine. Our experiments have demonstrated the ability of the proposed model to 
capture both local and global deformations. The novel multi-scale shape model can be 
beneficial in a wide range of medical imaging tasks. In the subsequent chapters, we 
utilize this shape model for the segmentation and quantification of vertebral structures in 


digital x-rays. 
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5 Vertebral Shape Analysis and Classification 
using Localized, Pathology-related Shape 
Parameters 


5.1 Introduction 


Statistical shape analysis is used to understand how anatomical structures tend to vary for 
various reasons such as the development of a certain pathology, aging, and surgery. A 
long-standing goal for medical imaging research has been to enhance the ability to 
distinguish between the anatomy of healthy individuals and those with pathology. This 
ability can allow for the accurate early detection and diagnosis of diseases. It also opens 
the door for a wide range of applications such as surgical planning and digital archiving. 
Despite the research efforts in the field, the efficient analysis and classification of 
anatomical structures continues to be a formidable task. Anatomical structures often 
exhibit a wide range of variations due mainly to inter-subject variability and structural 


differences among different pathologies. 


This chapter focuses on the analysis and classification of human vertebral structures. 
Radiographs of the human spine are frequently examined to assess vertebral 
abnormalities. Medical experts often examine hundreds of spine x-rays to determine the 
existence of various vertebral pathologies. The common pathologies of interest are 
anterior osteophyte, disc space narrowing, and fractures. By carefully inspecting the 
outline shapes of the vertebral bodies, experts are able to identify and assess vertebral 
abnormalities with respect to the pathology under investigation. Features like osteophyte 
(bony growths on vertebral corners) and disc space narrowing are often used as visual 
evidence of osteoarthris (degenerative joint disease). Morphometric measurements of 
vertebral bodies are often used as evidence of vertebral fractures and ostereoposis (a 
skeletal disease characterized by a decline in bone mass). Traditionally, vertebral shape 
analysis has been performed by using a few points that outline vertebral bodies. There is 
now a growing trend towards shape-based methods for better quantification of vertebral 
deformities. Most vertebral shape analysis and classification schemes rely on global 
error-prone representations. However, most pathologies tend to be localized. Thus, new 
shape analysis and classification methods are needed to capture subtle pathology-related 


information. 
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In this chapter, we present a novel framework for the analysis and classification of 
vertebral structures using pathology-related Wavelet-ICA shape parameters. For vertebral 
shape analysis, we use the parameters of the multi-sale shape model proposed in chapter 
four, in combination with regression analysis techniques. Using the proposed framework, 
we are able to establish a statistically relevant relationship between the Wavelet-ICA 
modes of deformations and clinical information. This ability allows for the anatomically 
meaningful interpretation of the multi-scale modes of variations. Furthermore, we present 
a novel method for computer-assisted multi-vertebrae classification. The new 
classification scheme incorporates the following steps: localized shape modeling, feature 
selection, and statistical classification. We demonstrate the ability of our proposed 
method to identify local shape changes in vertebral structures related to various 
pathologies. Thanks to the locality of the features, the new classification scheme can 


assess the condition of several vertebrae simultaneously. 


The rest of the chapter is organized as follows: Section 5.2 presents the details of the 
proposed shape analysis and classification framework. Our experiments and results are 
presented in section 5.3. Discussion and conclusions are presented in sections 5.4, and 


5.5, respectively. 


5.2 Methodology 
5.2.1 Statistical analysis of vertebral structures by using Wavelet-ICA 
representation. 
We present a novel framework for the analysis of vertebral structures by using the 
parameters of the multi-scale shape model proposed in chapter four. We refer to the shape 
parameters as the Wavelet-ICA representation (WICA). The proposed framework 
incorporates the following: rich multi-scale representation of vertebral shapes, feature 
selection approaches, and statistical analysis techniques. We aim to achieve the following 
goals: 

a. Discriminate between two groups of normal and pathological vertebrae. 

b. Establish a relationship between the localized modes of variations and the clinical 

outcome variables. 

Figure 5.1 shows a block diagram of the proposed vertebral analysis framework using 
Wavelet-ICA representation (WICA). We also compare the proposed framework against 


a PCA-based point distribution shape analysis scheme. 
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Figure 5.1 Vertebral Shape Analysis framework by using Wavelet-ICA representation 


5.2.1.1 Establishing clinical relevance by using regression analysis 
We use regression analysis to establish the relationship between the Wavelet-ICA shape 


parameters (WICA) and a clinical outcome variable. 


Regression analysis is a technique that is normally used to assess the relationships 

between two or more variables. The relationship between the variables is generally 

expressed as (5.1) 
y Se. 

where X is called the independent variable (predictor), YY is the dependent variable 


(response), and € is a random error. 


The WICA-based regression analysis framework 
We use a regression model to establish the relationship between the WICA shape 
parameters and a clinical dichotomous outcome variable which can have two values: 


either O or 1. “0”s correspond to a normal vertebra, and “1”’s correspond to a vertebra 


with a pathology. 
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Let POC a be the matrix of the multi-scale Wavelet-ICA localized basis functions: 
Dwica = Ka wee Du | ee 
The projection of the shape *; onto the set of basis functions AI aa is referred to as 
the Wavelet-ICA representation (WICA): 
Vie Yul (5.3) 


For a number of N training samples, we construct a score matrix Y of the Wavelet-ICA 


shape representation: 


Vit Viger Jim 
Y =| at eager oM (5.4) 
YK Yagerseee YKM 


where Yj; is the projection of the shape ~; onto the Wavelet-ICA basis 9; Wie is the 


size of the training set and M is the number of wavelet-ICA basis. 


oS stg ¢ “th : one 
Let C; be the clinical outcome variable for the 1 vertebra in the training set. We apply 
a series of univariate regression tests to establish the relationship between each 


deformation mode, represented by the corresponding shape parameter ( Yj), and the 
clinical outcome variable C; : 

— (5:5) 
where Y j is the continuous dependent variable which represents the projections of all 


sth = 
shapes in the training set onto the J basis. C= Re Res Cc Fal is the binary 
independent variable that represents the clinical information. joy is the regression 


coefficient which encodes the mean difference between the groups of the normal and 


pathological vertebrae on the continuous outcome variable. 
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We use the least square method to find the regression coefficients, which are the ones that 


minimize the sum of the squared residual error (RSE): 


RSE(B,) =D) (¥5 Gs (5.6) 


A 


6, =arg ming {RSE(B,)}. (5.7) 


The minimization problem is solved by differentiating equation 5.6 with respect to B fe 


and setting the results equal to zero: 


B. 2 =e (5.8) 


Hypothesis testing 

Next, we perform hypothesis testing by using the non-parametric method [118]. This 
testing procedure evaluates the statistical significance of the regression coefficients. The 
result of this test is a range of t-scores with the corresponding p-values which is the 
probability that a significant relationship is declared when the variables are unrelated. 
Usually, a p-value of less than 0.05 indicates a significant relationship between the model 
variable (i.e., its deformation modes) and the clinical outcome variable. The non- 
parametric permutation test is a type of hypothesis testing that does not need parametric 
reference distributions. In non-parametric testing, a reference distribution is obtained by 
calculating all possible values of the test statistic under the re-arrangement of the labels 
on the observed points. We also perform a 5% false discovery estimation procedure in 
order to control the expected proportion of false positives due to multiple comparisons. 


More details about non-parametric hypothesis testing methods can be found in [118,123]. 


Using regression coefficients and their corresponding p-values, we are able to establish a 
relationship between the localized modes of deformations of the multi-scale shape model 
and clinical information. This ability also reveals the clinical relevance and 


interpretability of the model proposed in chapter four. 
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5.2.1.2 Discriminant Analysis 
We use class-disciminability to assess the ability of the Wavelet-ICA shape parameters to 
distinguish between normal and pathological deformations. For each deformation mode, 


we calculate the disciminability ratio (7) defined as the ratio between between-class 


variability (Op emeen ) and within-class variability (O,,,.,;, ): 


OF eracen mn DE (LL, — pl) (5.9) 
O vithin Sea — [Ly (5.10) 
JN 
r= Onepwéen (5.11) 
within 


where, {/ is the overall mean of the coefficients over the training set. /Z, is the average 


of the coefficients for class 1. y,, is the coefficient of the j"" training shape in class 1. 


By calculating the discriminant ratio (7 ) for each deformation mode, we can evaluate the 
ability of the representation to distinguish between the classes of normal and pathological 


vertebra. 


5.2.2 Multi-vertebrae classification 
An essential step in a computer-aided diagnosis system is the automatic classification of a 
patient’s pathology. Computer-assisted classification methods aim at reducing the time 
and cost of the error-prone and time-consuming manual classification process. In medical 
imaging, an efficient classification scheme includes the following: 

- Arich representation that can capture important anatomical information. 

-  Anefficient classifier that can distinguish between different classes. 
The lower the dimension of the feature space, the more manageable is the classification 
stage. However, most pathologies are related to local deformations. Thus, the 
challenging task is to find a suitable low-dimensional representation that can still capture 
subtle pathology-related information. In this section, we propose a new multi-vertebral 


classification scheme using localized pathology-related shape parameters. 
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At the core of the proposed framework is a multi-scale localized shape representation 


based upon wavelet packets and Independent Component Analysis (ICA). The multi- 


scale shape model presented in chapter four can extract sets of localized modes of 


deformations specific to the vertebra under investigation. By projecting the shapes onto 


any specific set of deformation bases, we obtain low-dimensional features linked to the 


vertebra of interest. These features are then used as input to a Fisher classifier to classify 


the target vertebra as normal or abnormal. 


This section outlines the details for the proposed classification scheme. A block diagram 


of the model is shown in Figure 5.2. The proposed method incorporates the following key 


features: 


ii 


A localized multi-scale shape model that uses the concepts of sparsity, dimension 
reduction, and statistical independence to extract localized modes of 
deformations. 

A set of pathology-related Wavelet-ICA shape parameters obtained via basis 
sorting and feature selection. 


Simultaneous supervised classification for all the vertebrae of interest as normal 


or abnormal (i.e., with pathology). 


Original shapes of Multiscale best-basis ICA-based subspace 
Lumbar spine representation analysis 


; 


Normal / Abnormal 


Figure 5.2 Block diagram for the proposed vertebral classification scheme. 
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The following summarizes the steps for the proposed method: 


5.2.2.1 Multi-scale Wavelet-ICA shape representation 


Let a zero-mean shape vector be represented as 


Ae | ety Vin | (5.12) 


where x and y are x-and y-coordinates. 


We construct the multi-scale shape model described in chapter four. Let DcA be the 


matrix of the multi-scale basis functions. In the shape space, Dron constitute sets of 


localized pathology-related deformations. The Wavelet-ICA representation y, is the 


vector of coefficients obtained by projecting the shape X, onto the Wavelet-ICA basis 


matrix Drnick 3 


Yi —l V1 Vir); ie (5.13) 


5.2.2.2 Feature selection 

The main goal of this step is to find the subsets in the Wavelet-ICA representation that 
best capture the pathological information specific to every vertebra of the lumbar spine. 
The feature-selection step benefits from the sparsity and locality of the multi-scale model 
to obtain subsets of features that are not only small in dimension, but also spatially 
localized and have high discriminating power. This step aims at increasing the 
generalization ability of the classifier and also allows for efficient multi-vertebrae 


classification in the Wavelet-ICA domain. 


First, we perform basis sorting according to spatial locality by arranging the basis 
functions into sets of localized deformations such that each set corresponds to a specific 
vertebra. Thus, by projecting the shapes onto any specific set, we obtain a low- 
dimensional feature vector that is linked to the vertebra of interest. We also use the 
outcome of the discriminant analysis performed in section 5.2.1.2, to further select the 
features with the highest disciminability according to equation 5.14. The outcome sets of 


feature vectors are then used as inputs to the classifier. 
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A- Feature Spatial Locality (FSL): 
We arrange the basis functions into sets of localized deformations such that each set 
corresponds to a specific vertebra. The Feature Spatial Locality criterion is defined as 
follows: 

A 
Let X; describe the spatial shape variation vector corresponding to the i” component of 
the WICA feature vector. Every element in the WICA feature vector corresponds to a 


particular WICA basis function or variation mode: 


x; = (Opp ) "Tw" =, Ser. (5.14) 


where @,, : is the matrix of the Wavelet Packet best basis set. 


W : is the learnt ICA weight matrix. 


y; :is the WICA-feature vector of the input shape. 


e, : isa vector with “1” at the i" position and zero every where else. 


e,€ R™ ,M is dimension of the original shape space in spatial domain. 


AN 


We partition the shape variation vector X; into five regions corresponding to the five 


lumbar vertebrae. In each region ( 1 ), we calculate the Euclidean distance between the 


a (1) (1) 
shape variation vector X, and the mean shape x 


a (1) (1) 


ea (5.15) 


iS Oe it beech hel: 


2 


Given a Wavelet-ICA shape representation vector (y, ), the i 


element of the feature 
vector is then labelled as being spatially localized to the lumbar vertebra L,. k is given 


by equation 5.16: 
k = arg max, {d,/ =1..5}. (5.16) 


Using this criterion, we extract sub-sets of the Wavelet-ICA features that correspond to 
each vertebra. Figure 5.3 shows an example of a WICA feature element spatially 


localized to the L4 vertebra. 
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Figure 5.3 Example of a WICA-shape parameter spatially localized to L4 lumbar vertebra. 


B- Feature Discriminatory Power (FDP): 
For every vertebra, we use the discriminability metric presented in section 5.2.1.2 to 


select a subset of the Wavelet-ICA feature space that has the most discriminatory power. 


C- Fusion of feature selection methods: 

We also investigate the effect of fusion of the two previous criteria. First, we select the 
subset of features spatially localized to the vertebra under investigation. Second, we use 
the statistical discriminatory power criterion to further reduce the dimensionality of the 


feature space. 


5.2.2.3 Shape-based classification 
Next, we use the feature vectors obtained from the previous section, along with a suitable 
statistical classifier to distinguish between normal and pathological populations of the 


lumbar vertebrae. 


With each vertebra being labelled as normal or abnormal, the input to the classification 


problem is a training set of pairs of feature vectors and the corresponding labels: 
LOO ly ees Yost) cls (5.17) 
where jy, is the feature vector for the i” shape. 1, is a label that can either be “1” or 


“_1” indicating whether the vertebra of interest is normal or abnormal (i.e., with a 


pathology). 
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A statistical classifier aims at dividing the feature space into regions separated by 
decision boundaries. A training set is used to learn the parameters of the classification 
model. Various classification methods can be found in the statistical learning literature 


[119, 124]. 


Generally speaking, there are two common approaches for classification. 
The first approach, referred to as the elementary decision theory, uses the available 
training set to estimate the underlying class conditional probability functions, 
(p(y/c(y)=1) and p(y/c(y) =-1) ). The classification is then achieved by using the 
Posterior Likelihood estimate: 

c(y) =arg max, p(y/c(y) =i) p(c(y) =i), (5.18) 
where p(c(y) =i) is the prior knowledge, i=1 for the normal case, and i=-1 for the 


pathological case. 


The other classification approach involves using decision functions to estimate decision 
boundaries. Instead of imposing assumptions about the class posterior probability, this 
approach makes assumptions about the discriminant function. A discriminant function is 


defined as follows: 


f(ytT{B> yes. (5.19) 


Where T is the discrimination threshold. 
In the proposed vertebrae classification scheme, we use the Fisher classifier to classify a 


specific vertebra as normal or with osteophyte pathology. 


Fisher Linear Discriminant Classifier: 
The Fisher Linear discriminant classifier tries to separate the two classes of normal and 
pathological vertebrae by finding the direction (w ) along which the two classes are best 
separated according to the Fisher criterion, which is defined as the ratio of the between- 
class to within-class variances: 
T 2 
4 lw (LL, — LL (5.20) 


Fisher ~_ w Sw 


98 


{ , 7 : " 7 | i i] 
® 
vd Danie Ronin; Gant see nent oar) aut ve 


yy 


pores ewe! adi to SR AESTBO) aul) crest! ob ean 


“sess ott! anteris “yf fe senate si) at et wed ee by) 


Pa 
oeantteaghs Wl naif ain tg noun owe iv 2ohinene vi : ) 
IRF SO ae vate cote, a eA af Cs aon vs 
ee ee 
wif ini vd beverdas aa ar “poinghiinaats sat { (fies “cup ean i + 


aa A 


- | -owpeniaes bbe 
(Ri?) ee ee songs Sige 
i bap 3289 tical otter 1] aati’ iw nih io eert ai ice 


= 
' 


‘ 


noizrosh sindiies OF enOngnETl agieioob tia soiliomt rossoriqyyie nolan 
ait? .Vinkicndo Wm Aes. ve wily Sis suds enw igrateat gancgied we bate 


sf HOA deeming ty ye oy gif es ag remon tn pe i" 


~“iwal aa i 


bn Isumom to eskapts cw of i 0 wd ayid Wikicanko truscviersieei wana . 
i A 
1298 en coals oue-oelt, % gros (wy nbiinevity cals gisibee “a seenignn ott 
“nssonad oi Yo aber on ak ih ie diel ots andar oil ssadeal 16 De. 
pee Sp, olin a" 


(kz) — 


n-| 


Se m> n> |, (5.21) 
1 2 


where W is the direction of maximum discrimination, 
H,,/4, are the mean values for the class of normal and pathological vertebrae 
respectively. 


S,, is the within-class scatter matrix. 


De and > are the covariance estimates for the normal and pathological 
1 2 


classes. 


We close this section by commenting that the common challenge among various 
classification methods is the high dimensionality of the feature space with respect to the 
size of the training set available. In this chapter, we try to tackle this issue through the 
locality of the shape model, and the feature selection stage. Sets of low-dimensional 


feature vectors are used to describe each vertebra of the lumbar spine. 


5.3. Experiments and Results 


In this section, we evaluate the performance of the proposed vertebral shape analysis and 
classification schemes. Experiments are conducted by using a dataset of lateral x-ray 
images of the human lumbar spine [108]. For each x-ray image, the lumbar vertebrae L1- 
LS are annotated by experienced radiologists. Each vertebra is labelled as normal or 
abnormal (i.e., with osteophyte pathology). We focus on the anterior osteophyte (AQ) as 
the pathology of interest. Anterior osteophytes are bony growths along the anterior edge 
of the vertebra bodies. Figure 5.4 shows an example x-ray image of a lumbar spine with 


anterior osteophyte pathology. 
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Figure 5.4 Lumbar spine x-ray image from the National Library of medicine dataset [108]. 


We use a set of 150 outlined contours of lumbar vertebrae L1-L5, with each vertebra 


labelled as normal or abnormal. Figure 5.5 shows vertebrae with anterior osteophytes. 50 


% of the vertebrae were labelled as normal, and the rest as abnormal. A set of 100 


contours is used for training, and the remaining 50 contours are used for testing. The 


following summarizes the steps of the classification scheme: 


ily 


ee 


Construct the multi-scale shape model, and obtain the shape parameters, as 
described in section 4.3. 

Use the shape parameters of the model as feature vectors. We refer to these shape 
parameters as Wavelet-ICA (WICA) features. 

Use the training set to calculate the mean and standard deviation of the feature 
vectors for the two classes of vertebrae (normal and pathological). 

Normalize the feature vectors in the training set. 

Use the set of feature vectors to train the Fisher classification rule. 

Evaluate the accuracy of the classification scheme. 


Repeat steps 1-5 for 20 randomly generated training and testing sets. 
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Figure 5.5 Vertebrae with anterior osteophytes. (a) Label points selected by expert radiologist. Points 8, 
9 characterize osteophyte pathology, (b) two lumbar vertebrae with the 9-points model (c) Contour of 
Lumbar vertebra with upper left-hand corner osteophytes, (d) Contour lumbar vertebra with upper 
and lower left-hand corner osteophytes. 


The experiments are conducted to achieve the following goals: 

- Evaluate the ability of the multi-scale shape model to predict pathological vertebral 
deformities. 

- Evaluate the ability of the proposed Wavelet-ICA shape analysis scheme to explain the 
anatomical differences between normal and pathological vertebral populations. 


- Evaluate the accuracy of the proposed multi-vertebrae classification scheme. 


5.3.1 Shape prediction in presence of pathology 

We investigate the ability of the multi-scale shape model to capture localized subtle 
pathology-related variations. Figure 5.6 shows examples of deformation modes that 
capture variations in the L4 vertebra of the lumbar spine. Using the leave-one-out 


experiment, we investigate the ability of the model to predict unseen vertebrae with 
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various degrees of anterior osteophytes pathology. Figure 5.7 compares the prediction 
capability of the proposed model, and that of a PCA-based model [36]. Our model tends 


to better reconstruct the shapes of normal and abnormal vertebrae (i.e., with anterior 


pathology). 


Figure 5.7 Shape prediction for normal and abnormal vertebrae by using PCA-based model [36] (top 
row), and the proposed model (bottom row). (a,d) are normal vertebrae, (b,c,e,f) are vertebrae with 
osteophytes pathology. 
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5.3.2 Evaluation of Wavelet-ICA shape analysis framework 
We evaluate the ability of the Wavelet-ICA shape analysis framework to explain group 
differences between normal and pathological vertebrae. We also compare the proposed 


analysis framework against a PCA-based shape analysis scheme. 


In section 5.2, we use univariate regression analysis to relate the modes of deformation of 
the multi-scale shape model to a clinical dichromatic outcome variable. In our 
experiments, the clinical variable is the anterior osteophyte pathology. Figures 5.8-9 
show the resulting regression coefficients of the localized deformation modes and the 
corresponding significance levels for the lumbar vertebrae L1-L5. For the vertebrae L1- 
L5, the percentages of localized deformation modes that have a statistically significant 
relationship with the clinical variable (p<0.05) are as follows: 55 %, 58 %, 38.8 %, 
65.52%, and 57.7 %, respectively. 


For all vertebrae investigated, an average of 55 % of the spatially localized deformation 
modes was found to have a statistically significant relationship with the osteophyte 
clinical variable. This finding suggests that the proposed localized, pathology-related 
shape parameters can be further exploited in applications such as pathology-based image 


retrieval. 
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Figure 5.8 Regression coefficients from the investigation of the relationship between the localized 
deformation modes and the osteophyte pathology for lumbar vertebrae L1-L3.Statistical significance 
levels of the localized deformation modes are indicated by the color code: black (p<0.001), dark gray 

(p<0.01), and light gray ( p<0.05) 
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Figure 5.9 Regression coefficients from the investigation of the relationship between the localized 
deformation modes and the osteophyte pathology for lumbar vertebrae L4-L5.Statistical significance 
levels of the localized deformation modes are indicated by the color code: black (p<0.001), dark gray 

(p<0.01), and light gray ( p<0.05) 


5.3.3 Evaluation of the multi-vertebrae classification scheme 

The performance of the proposed classification scheme is evaluated by using percentages 
of the correct class classification, and the Receiver Operating Characteristic curves 
(ROC). The feature discriminatory power (FDP) criterion is used for the feature selection 
step. We compare the performance of the proposed classification scheme against a PCA- 
based classification scheme. We use a training set to construct a PCA-based basis of 
eigenshapes. These are similar to the basis functions of the Point Distribution Model 
(PDM) proposed by Cootes et al. [35]. The features are extracted by projecting the 


vertebral shapes onto the space of eigenshapes. The features are then ranked in 
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descending order of the eigenvalues. This classification scheme is similar to the shape- 


based classification method proposed by Roberts et al. [92]. 


We measure the ability of the proposed classification scheme to discriminate between 
normal and pathological vertebrae by using the True Positive Rate (TPR) and the True 
Negative Rate (TNR). A True Positive Rate is defined as the percentage of normal 
vertebrae that are correctly classified as normal. The True Negative Rate is the percentage 


of abnormal vertebrae that are correctly classified as abnormal: 


Aloe 
TP +FN Cee 
ee (5.23) 
TN + FP 


Table 5.1 summarizes the correct normal and abnormal classification rates for the 
vertebrae of the lumbar spine. The results are reported for the proposed classification 
scheme and the PCA-based scheme. The Fisher linear discriminant classifier is used in 
both frameworks. In comparison to the PCA-based classification scheme, the proposed 
method has achieved significant improvements in the classification rates. Table 5.1 shows 
improvements of 14.8 % and 53.60 % in the correct classification rates of normal and 
pathological L1 vertebra. The proposed scheme has achieved improvements of 35.37 %, 
and 33.50% in the correct classification rates of normal and pathological L2 vertebra. For 
the L3 vertebrae, the improvement percentages are 40.12% and 6.55% for normal and 
pathological classes, respectively. Improvement percentages of 29.74 % and 11.75 % are 
obtained in the correct classification rates of the L4 vertebra. For the L5 vertebrae, the 
improvement percentages are 18.28% and 16.82% for the normal and pathological cases, 


respectively. 


For the entire lumbar spine L1-L5, the average and standard deviation classification rates 
of the proposed method are 96.08 % and 3.47 % for normal vertebrae, and 94.58 % and 
4.82 % for abnormal vertebrae, respectively. In comparison to the PCA-based 
classification scheme, our method has a more consistent performance over all the 
vertebrae of the lumbar spine. The localized Wavelet-ICA features tend to better quantify 


the concavities in the vertebral body across the lumbar spine. 
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Table 5.1 Correct classification rates of the proposed classification scheme, and PCA-based 


classification scheme for detection of anterior osteophyte in lumbar vertebrae. 


iets Fates Proposed classification method PCA-based classification method 


Vertebra correct normal % correct abnormal % | correct normal % | correct abnormal % 


Index (TNR) (TPR) (TNR) 


97.92 


Receiver operating characteristics (ROC) curves are also used to assess the ability of the 
proposed framework to discriminate between normal and pathological vertebrae. The 
ROC is a plot of the True Positive Rate (TPR) versus the False Positive Rate (FPR) as the 
threshold varies. In the statistical learning literature, The True Positive Rate (TPR) is also 
referred to as the sensitivity, and the False Positive Rate is defined as (1-specifity). The 
ROC curve provides a good method for visualizing the effect of changing the threshold 
on the classification accuracy. The Area under the ROC curve (AUC) is also used in the 
evaluation process. The AUC measures the seperability of the distributions of the normal 


and pathological classes. 


Figures 5.10-14 show the ROC curves for the proposed classification scheme (WICA- 
based) and the PCA-based scheme [92] for the vertebrae of the lumbar spine L1-L5. The 
areas under the curves (AUC) for the ROC curves are given in Table 5.2. Table 5.3 


compares the specefity of the two classification schemes at sensitivities of 0.80 and 0.90. 


The proposed WICA-based classification scheme appears to dominate the ROC curves 
for the entire lumbar spine L1-L5. At a sensitivity of 80%, the proposed WICA-based 
classification scheme achieves an average specefity of 99.25 %, versus an average 


specefity of 72.9 % for the PCA-based classification scheme. At a sensitivity level of 
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90 %, the WICA-based classification scheme achieves an average specefity of 97.17 % 
versus 52.88 % for the PCA-based model. 

At a False Positive Rate (FPR) of 0.2 (i.e., the specefity equals 80 %), the proposed 
classification scheme achieves an average sensitivity of 95.2 %, versus 78.2% for the 
PCA-based scheme. At a False Positive Rate (FPR) of 0.05 (i.e., the specefity equals 95 
%), the proposed classification scheme achieves an average sensitivity of 93.2 %, versus 
59.2% for the PCA-based scheme. In terms of the Area under the Curve (AUC), the 


proposed classification scheme has achieved an average improvement of 14.2 %. 
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Figure 5.10 ROC curves for detection of anterior osteophytes in L1 vertebra 
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Figure 5.11 ROC curves for detection of anterior osteophytes in L2 vertebra 
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Figure 5.12 ROC curves for detection of anterior osteophytes in L3 vertebra 
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Figure 5.13 ROC curves for detection of anterior osteophytes in L4 vertebra 
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Figure 5.14 ROC curves for detection of anterior osteophytes in LS vertebra 
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Table 5.2 Areas under curves for ROC curves of the proposed classification scheme 
and the PCA-based classification scheme. 

AUC 

WICA-based classification 


Vertebra Index 


AUC 
PCA-based classification 


: 3 
Geddes, 


0.9833 0.8797 
L5 0.9826 0.8933 
Mean 0.9776 0.8561 


Table 5.3 Specefity at various sensitivities for the Lumbar vertebrae L1-L5 . 


adn iil 9a sma se 
>) 


100 % 86.4 % 99.83 % 
100% 72.6 % 99.98 % 63.4% 


5.3.2 Effect of feature selection on classification accuracy 

This section investigates the effect of the feature selection stage on the overall 
classification accuracy. We evaluate the performance of two feature selection criteria: the 
feature spatial locality (FSL) and the feature discriminatory power (FDP). Table 5.4 
shows the average correct classification rates achieved by using different feature selection 


approaches. 
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Table 5.4 Effect of various feature selection approaches on the class classification 
accuracy. 
Feature selection Method | correct classification rate | correct classification rate 


of normal vertebra of abnormal vertebra 


Using spatial locality 94.966 % 91.776 % 
criterion only 


Using discriminatory 94.58 % 

power criterion only Ur icy 
Fusion of the two 97.528 % 94.522 % 
selection criteria wey 


Based upon the figures in table 5.4, the following comments can be made: 


- In general, the feature selection step has significantly improved the 


classification rates of normal and abnormal classes. 


- By using the feature spatial locality criterion (FSL), we achieve an increase of 
7.14 %, and 8.32 % in the correct classification rates of normal and abnormal 


vertebra, respectively. 


- By using the feature discriminatory power criterion (FDP), we achieve an 
increase of 8.25% and 11.12% in the correct classification rates of normal and 


abnormal vertebra, respectively. 


- A limited performance gain is achieved by the integration of the two selection 


criteria. 


5.4 Discussion 

The experiments in section 5.3 have demonstrated the ability of the proposed Wavelet- 
ICA shape analysis and classification framework to capture subtle discriminating features 
of the vertebral bodies. The proposed classification scheme has achieved an average 
classification rate of 97.53 % for normal vertebrae, and 94.52 % for vertebrae with 
anterior osteophyte pathology. These results compare well to those from related vertebral 


classification methods in the literature. In [120,121], researchers from the National 
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Library of Medicine presented classification schemes for lumbar and cervical vertebrae 
against anterior osteophytes. In [120], the authors reported average correct classification 
rates of 90.46 for normal lumbar vertebrae, and 89.48% for abnormal vertebrae or those 
with anterior osteophyte pathology. In [121], the authors reported classification 
accuracies of 84.44 % for normal cervical vertebrae, and 77.08 % for pathological 
cervical vertebrae. Moreover, we are able to perform multi-vertebrae classification due to 
the spatial locality of the Wavelet-ICA features. The sparsity and spatial locality of the 
Wavelet-ICA representation have enabled us to extract subsets of the feature space that 
are linked to each of the five lumbar vertebrae (L1-L5). This result allowed us to perform 


accurate multi-vertebrae classification for the complete lumbar spine. 


We have also investigated the potential gains of the feature selection step on the Wavelet- 
ICA domain prior to classification. In general, the feature selection step has significantly 
improved the overall classification accuracy for both normal and abnormal vertebrae. We 
explored two criteria for selecting feature subsets: the feature spatial locality (FSL) and 
the feature discriminatory power (FDP). The sequential fusion of the two criteria has 
resulted in an increase of 9.39 % and 11.52% in the correct classification rates of normal 
and abnormal vertebrae, respectively. Better results were achieved by using the feature 
discriminatory power (FDP) criterion than by using the feature spatial locality criterion 
(FSL). This difference is attributed to the fact that the feature discriminatory power 
criterion is a more general criterion that allows us to select significant spatially localized 


features, as well as other discriminating features from neighbouring vertebrae. 


5.5 Conclusion 
In this chapter, we investigated the potential of the novel multi-scale shape model in the 


quantification and diagnosis of vertebral deformities. 


We presented a novel vertebral shape analysis and classification framework using 
localized, pathology-related Wavelet-ICA parameters. Within a Wavelet-ICA regression 
analysis framework, we were able to establish a statistically-relevant relationship between 
the shape parameters and clinical information. Our experiments demonstrated the ability 
of the proposed framework to discriminate between classes of normal and pathological 
vertebrae. This ability opens the door for other anatomically meaningful interpretations of 


the modes of deformations of the multi-scale shape model. Furthermore, we presented a 
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novel method for computer-assisted multi-vertebrae classification, which incorporates the 
following main steps: localized shape modeling, feature selection, and_ statistical 
classification. High classification accuracies were achieved in the experiments. Thanks 
to the locality of the features, the new classification scheme can simultaneously assess the 


condition of the complete lumbar spine against the pathology of interest. 
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6. Prior-driven Segmentation Using Local 
Appearance Profiles in Contourlet Domain 
and Multi-scale Shape Model 


6.1. Introduction 
In this chapter, we attempt to bring recent advances in the fields of image processing and 


computational harmonic analysis to the field of medical imaging. Within this context, we 
propose novel methods for the segmentation and retrieval of anatomical structures by 


using Contourlet-based local appearance profiles and the multi-scale shape prior. 


The segmentation of anatomical structures in medical images is an essential step for 
various medical applications such as computer-aided diagnosis, 3D visualization, and 
surgical planning. Manual segmentation methods are both time-consuming and non- 
reproducible. Hence, developing efficient computer-assisted segmentation methods has 
been a long-standing goal within the medical imaging research community. In this work, 
we are particularly interested in the segmentation of human vertebrae in digital 
radiographs. Accurate vertebral segmentation plays a vital role in the proper assessment 
of various vertebral abnormalities. Nevertheless, efficient vertebrae segmentation 
continues to be a challenging task for a number of reasons. First, poor image quality and 
noisy and incomplete data tend to affect the ability of low-level methods to accurately 
detect the boundary of the vertebral body. Second, the complexity of the anatomical 
structure of the human spine and the wide range of possible deformations tend to limit the 
ability of top-level prior-based methods to generalize well to unseen examples. Thus, an 
efficient vertebrae segmentation algorithm needs to keep the balance between the two 
extremes. On the one hand, it should have enough prior knowledge to reduce the 
sensitivity to noise and image quality. On the other hand, it should still be able to detect 


valid deformations. 


Another medical application of interest in this chapter is content-based image retrieval 
(CBIR), which refers to the retrieval of images in the database by using visual features 
extracted from the images, rather than textual information. In the past few years, CBIR 
has received increasing attention within the research community for two main reasons: 

(i) the recent advances in digital imaging technologies and data compression methods, 


and (ii) the substantial growth in the number of digital images in hospitals and the 
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increasing popularity of Picture Archival & Communications systems (PACs). Image 
retrieval systems have the potential to serve many educational and diagnostic purposes 


within the medical imaging community. 


For two major research groups, the segmentation and retrieval of spine radiographs have 
been the main focus. The first group is the ListerHill National center of Biomedical 
Communications, which is an R&D division of the US National library of Medicine 
(NLM). The NLM has made available a set of 1,700 cervical and lumbar spine digital 
radiographs with various pathologies that were collected in the second National Health 
and Nutrition Examination Survey (NANESII) conducted by the US National Center for 
Health Statistics (NCHS) [108]. These images have been used by various researchers 
worldwide who are interested in the segmentation and retrieval of spine x-rays. Another 
major effort within the field is a collaborative project carried out by the departments of 
Radiology, Medical Information, and Computing Science at the Aachen University of 
Technology in Germany. The Image Retrieval for Medical Applications (IRMA) project 
aims at developing content-based image retrieval systems for diagnostic tasks. A large 


database of radiographs has been made available to the research community [125]. 


The rest of this chapter is organized as follows: section 6.2 highlights the main 
contributions presented in this chapter. Section 6.3 presents the details for a novel 
appearance-based salient point detection and matching algorithm. Section 6.4 presents a 
novel vertebrae segmentation method. Section 6.5 presents the details for a novel 
Content-based medical image retrieval system. Our experimental results are presented in 


section 6.6, and our conclusions are drawn in section 6.7. 
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6.2. Contributions 
Figure 6.1 illustrates the general framework for the methods presented in this chapter. 


The following outlines the main contributions presented in this chapter: 


In an attempt to bring recent advances in image processing and computational 
harmonic analysis to the field of medical imaging, we present a novel salient 
point detection and matching algorithm based upon Non-Subsampled Contourlet 
transform (NSCT), a truly 2D multi-scale directional image analysis tool. In 
contrast to low-level salient point detection methods, our method integrates prior 
knowledge about the local appearance of the salient points. The proposed method 
has proven successful in detecting a reduced number of robust and stable salient 
points that are most relevant to the structure of interest. The Contourlet-based 
salient point detection method is central in the new methods for the segmentation 


and retrieval of radiographs. 


In an attempt to overcome challenges in the segmentation of vertebral structures, 
we present a novel segmentation method using shape-driven salient point 
matching in the contourlet domain. At the low-level, the method uses the Non 
Subsampled Contourlet Transform (NSCT) to detect robust and repeatable salient 
points in the spine x-ray. Rather than using global texture models to drive the 
segmentation process, we construct contourlet-based local appearance profiles at 
salient points of the vertebral bodies. A multi-scale shape prior is also 
constructed as described in chapter four. During the testing phase, the contourlet- 
based local appearance model is used to detect the vertebral boundaries in the test 
x-ray image, while the localized multi-scale shape prior is used to drive the 
segmentation process. The new method is shown to have high performance in 


terms of robustness to noisy images and generalization to unseen cases. 


We have also used the proposed contourlet-based salient point detection method 
in a content-based image retrieval task. The new method uses the robustness and 
stability of the extracted salient points, and the information-rich Contourlet 


descriptors to retrieve medical images of similar classes. 
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Low Level Tasks Learning Prior Knowledge 
Mult-scale directional analysis Muki-scale Shape Prior 


Salient point detection 


Image Segmentation 


Knowledge-driven High Level Tasks 


Figure 6.1 General framework for knowledge-driven methods presented in this chapter 
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6.3 Multi-scale salient point detection in Contourlet domain 
This section presents the details for a new method for detecting salient points in medical 


images based upon Non Subsampled Contourlet analysis. This method is central to the 


segmentation and retrieval methods presented in sections 6.4 and 6.5, respectively. 


6.3.1 Background 

Based upon our understanding of the Human Visual System (HVS), it is believed that the 
human vision tends to be selectively attentive to localized “different” regions of the 
visual field. The detection of these salient regions enables the brain to perform other 
higher-level visual tasks such as pattern recognition and object tracking. In the literature, 
these regions of interests are often referred to as interest points, key points, or salient 
points. No exact definition exists for what makes a specific region be considered as a 
salient region. However, the expression is generally used to refer to regions of the visual 
field with localized distinctive features such as edges, corners and blobs. All these are 


thought to provide cues for recognizing shapes and objects in complex visual scenes. 


Inspired by the Human Visual System (HVS), salient point detection methods have been 
used as a low-level image processing procedure in many computer vision tasks including 
registration, tracking, robot navigation, and medical imaging. The performance of the 


salient point detection algorithm has a great impact on subsequent higher-level tasks. 
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Generally speaking, salient points need to be well defined and stable under different 
transformations. Various salient point detection algorithms are often evaluated in terms of 
repeatability, which is defined as the ability of the method to detect the same salient 
points under various transformations. Popular transformations are scaling, rotation, and 


change in view. 


Several salient point detection approaches were proposed in the literature. In [126], 
Schmidt et al. grouped salient point detection methods according to the theory of 
operation into three categories: edge-based, intensity-based, and biologically-inspired 
methods. One of the most famous salient point detectors is the Harris corner detector 
[127], which detects corners in an image by using the autocorrelation matrix. Despite 
being useful in relatively simple scenes, the Harris detector suffers from a number of 
limitations. First, it is based on single scale analysis. Second, it suffers from poor 
localization ability. The SIFT algorithm is another important salient point detection 
method [128]. It extracts salient points by finding the local maxima of the response of the 
Difference-of-Gaussian (DOG) operator. Biologically inspired salient point detectors 
focus on directional and multi-scale analysis of images. Tian [129] and Loupais et al. 
[130] proposed wavelet-based salient point detectors. The wavelet transform has a 
number of appealing properties such as multi-scale representation and spatial-frequency 
localization. Nevertheless, wavelets suffer from a lack of shift invariance and limited 
orientation selectivity, both are important properties for the efficient detection of salient 
points in complex scenes such as medical images. In [131], Itti et al. proposed a 
biologically inspired bottom-up saliency detector based upon Gabor filters. In [132], the 
authors presented a salient point detector based upon the phase congruency of Log-Gabor 
filter response. In [133], Gao et al. proposed a multi-scale corner detection method based 
upon the Log-Gabor transform. In comparison to wavelets, Gabor filters tend to offer 


better directional and multi-scale analysis. 


6.3.2 Methodology 

In this chapter, we are interested in salient point detection and matching in medical 
images. Detecting robust and stable salient points of interest in medical images is a 
challenging task, because of the low quality of the images, and the complexity of the 


anatomical structures of interest. 
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In medical image analysis, an ideal salient point detector needs to exhibit the following 
characteristics: 

- The detector should be able to extract information from multiple scales and 
orientations. No strict definition exists for what constitutes a salient point in an 
image. Thus, it is logical to use biologically inspired filters in the detection 
process. 

- The localization property of the detector is also an important aspect that tends to 
affect other higher level tasks such as feature extraction, matching, and 
registration. 

- Also, the detector should be able to capture higher-level salient structures in the 
image (e.g., contours). 

- The computational aspect of the detector is important since salient point detection 
always occurs at the early stages of the computer vision system. 

- In most medical image analysis applications, we are interested in a particular 
object. Thus, it is beneficial for the salient point detector to be target-driven. It 
can be made so by integrating prior knowledge to the salient point detection 


procedure. 


In an attempt to achieve some of these features, we present a new method for salient point 
detection that is suitable for the extraction of robust points of interest in medical images. 
The new method is utilized as an early image processing stage that serves the other 
higher-level imaging methods presented in section 6.4. In this section, we propose a new 
appearance-based salient point detection method based upon Non Sub-sampled 
Contourlet Transform (NSCT). The multi-scale and directional capabilities of the NSCT 
transform are utilized to extract salient points that are robust to noise and rigid and non- 
rigid deformations. Moreover, the proposed method also utilizes prior knowledge about 
the local appearance of the object of interest. This ability allows for the detection of a 


reduced set of salient points that are most relevant to the structure of interest. 


6.3.2.1 Salient point detector 
This section presents the details for a novel salient point detection method using Non- 


Subsampled Contourlet Transform (NSCT). Using the capabilities of the NSCT transform 
to capture higher-order image structures, we are able to extract robust salient points in 


medical images. In the context of our work, salient points refer to significant corners and 
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smooth contours that can be used as cues during the segmentation and retrieval processes. 
The new method is based upon the NSCT and the second moment matrix. We will refer 
to the new method as NSCT-SMM. The input image is first decomposed by using the 
NSCT transform at different scales and orientations. The magnitudes of the 
decomposition at all scales and orientations are then formulated into a second moment 
matrix. Following classical moment analysis, [134,135], intensity variation in the 
contourlet-based feature image is characterized by the two eigen axes of the second 
moment matrix. The strengths of the maximum and minimum moments are used to infer 
the existence of significant corners and contours, respectively. Figure 6.2 shows a block 


diagram for the proposed method. 


Input Image 


Multi-scale multi-directional 
Contourlet Decomposition 


Formation of second moment 
matrix 


Generation of the feature strength 
image 


Non-maximum suppression 
& Thresholding 


Salient points 


Figure 6.2 Flowchart of the proposed multi-scale contourlet-based salient point detector 
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The following explains the steps for the proposed method: 


a. Non Sub-sampled Contourlet decomposition: A multi-scale directional 
decomposition is achieved by using the NSCT transform. The input image /(x, y) 


is decomposed into 4 scales. In the first scale level, we use a 4-level Directional Filter 
Bank (DFB) to get 16 directional frequency partitioning. In the second and third scales, 
eight directional frequency partitions are obtained by using a 3-level Directional Filter 


bank. In the final scale level, 2-level directional filter-banks are used. Let 


{B}, J =9,1,2,3 denote the set of response images that results from the non sub- 
sampled pyramid decomposition step, where J is the scale level. At each scale 4 , the 
response image B, is further decomposed by using a l j7level non sub-sampled 
Directional Filter Bank to produce a set of multi-scale multi-directional contourlet sub- 
band images {w,,} , where k = 0,1,..., Die and j =0,1,2,3. Figure 6.3 shows the 


oriented contourlet basis functions for two different scales. Due to the flexibility of the 
transform, we can decompose the image by using different angular resolutions at every 


scale. 
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Figure 6.3 NSCT basis functions for two scales with different angular resolutions at each scale 


b. Following classical moment analysis equations, [134,135], we compute the 
following: 
= 2 (6.1) 
as M 
a= ys (Wi, lcos(A, )) 
j=0 k=0 
M_ Nj (6.2) 
b= >. >||W,, {cost )sin@, ) : 
j=0 k=0 
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((W,., [sin(8,)) (63) 


where 


J is the scale of the decomposition. 
M is the maximum number of scales. 
k is the orientation at scale J] ; 
N j is the number of decomposition orientations at scale J ] 


Wi 


is the magnitude of the non subsampled contourlet coefficients at scale J and 


orientation k , 


i= th eth 

aes _ (k-)z is the angle of the k orientation at the J scale level. 
> N ; 
ay 


C: The second moment matrix is then constructed in the contourlet domain as 


a b 
M a (6.4) 
Pua 


d. The eigenvalues of the second moment matrix in equation 6.4 is then calculated 


| 1 
E, =r lgac ava? alae) (6.5) 
| ee 2 (6.6) 
E, =,latc]+ 4b ae eee 


The eigenvalue decomposition of the second moment matrix M provides a description of 


as follows: 


the gradient structure in the image. The calculated eigenvalues E, and E, indicate the 


certainty of along the directions of their associated eigenvectors [134]. 
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e. Compute the Contourlet-based feature strength image: Our main goal is to use the 
salient point detector for the segmentation of the anatomical structures in x-ray images. 
We would like to extract salient points which correspond to the significant corners and 
smooth contours of the object of interest. Thus, we define a contourlet-based feature 


strength image ( F’ ) as a weighted sum of the two eigenvalues of the second moment 


matrix. The smaller eigenvalue (£, ) is a measure of the corners in the image, and the 


larger eigenvalue ( £, ) is a measure of the contours and edges in the image [134]: 


F =@E,+(-Q@E,. 6.7 


Figure 6.4 shows an example of x-ray image of the lumbar spine and the corresponding 
Contourlet —based feature strength map. 


Figure 6.4 Example x-ray image and corresponding Contourlet-based feature strength image. 
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f. Non-maximum suppression and thresholding: Salient points are finally extracted from 
the Contourlet-based feature image by using non-maximum suppression and 


thresholding. This step selects local maxima points as the salient points. 


6.3.2.2 Incorporation of local appearance knowledge 
As discussed before, salient point detectors usually appear at early stages of computer 


vision tasks. Thus, the number of detected points greatly affects the efficiency of the 
whole application. Most of the methods in the literature rely on low-level operations 
where no prior knowledge is incorporated in the detection process. However, in many 
cases, we have prior knowledge about the object of interest. Thus, our goal is to perform 


a sort of object-guided search for the salient point, rather than a pure bottom-up search. 


We propose to further enhance the proposed salient point detection method described in 
section 6.3.2.1 by incorporating prior knowledge about the object of interest. Doing so 
has the advantage of reducing the number of detected salient points for subsequent 
complex tasks. This procedure will increase the efficiency of the overall computer vision 


task at hand (i.e. segmentation). 


The new method benefits from the rich multi-scale directional information offered by the 
Non Subsampled Contourlet decomposition to create localized descriptors for the salient 
points of the object of interest. The local descriptors are then modeled by using Mixture 
of Gaussians to construct a local appearance model of the target salient points. The 
following summarizes the steps for the new appearance-based salient point detection 
method: 

1. We start with a training set of x-ray images with a number of ground truth salient 
points for the object of interest. For each ground truth salient point, we extract a 
local appearance profile based upon NSCT features. 

2. Using the set of NSCT descriptors, we construct local appearance profiles for the 
target salient points by using Mixture of Gaussians. 

3. For anew image, we extract the candidate salient points by using the bottom-up 
approach explained in section 6.3.2.1. 

4. A reduced set of salient points is then obtained by estimating the likelihood that 


every detected point corresponds to the learnt appearance model. 
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The following presents the details for the proposed method: 

i. Feature Extraction 

Given a set of M training images with a ground truth segmentation for the object of 
interest, we extract a set of N ground truth salient points from each image. First, we use 
the method described in section 6.3.2.1 to detect the salient points in the image. Next, 
ground truth salient points that correspond to the object of interest are extracted by using 
the ground truth segmentation. A ground truth salient point is defined as a salient point 
that has a point-to-curve distance of less than 5 pixels. Figure 6.5 shows two x-ray images 


with the ground truth salient points that define the border of the lumbar vertebrae L1-L5. 


Figure 6.5 Example of ground truth salient points that define the lumbar spine in the x-ray images 


For every ground truth salient point (P.), we extract a local descriptor based upon the 


Non Subsampled Contourlet decomposition. The local descriptor is constructed by using 
the NSCT feature images obtained in section 6.3.2.1. We use a square grid of NxN (N=7) 
points. The ground truth salient point is at the center of the grid. Variance-based features 


are then extracted from the NSCT directional sub-bands at every scale, excluding the 


low-pass response. 
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For every NSCT feature image, we calculate the average absolute deviation of the gray 
values of the square grid centered at the salient point id : 


fee peo 


i ft 5 ale i 
E; (a,b) —— ee xi0.k)— mi (6.8) 
N N 
l=a-—— k=b-— 
2 2 
where X; denotes the a  NSCT feature image, ve (1,k) is a coefficient at location 
(1,k) of the NSCT feature image, and Le. is the mean of the coefficients in the NxN 


square grid whose center is the salient point li 


The contourlet-based local descriptor for the salient point P is then defined as 


i pi i i 4T 
ig =[E, , Boe ey, ess, | ' (6.9) 
where 1 is the average absolute deviation for the “i NSCT sub-band, and D is the 


total number of NSCT sub-bands. 


Figure 6.6 summarizes the complete procedure for the construction of the Contourlet- 
based local descriptor. The outcome of this stage is a set of K=NxM_ Contourlet-based 
feature vectors extracted from all the ground truth salient points in the training set. Next, 
we use Mixture of Gaussians to construct a local appearance profile for the salient points 


of the object of interest. 
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Figure 6.6 Extraction of Contourlet-based local descriptor for a salient point. 


ii. Mixture of Gaussians Model 


Let {F.} be the set of Contourlet-based local descriptors for the ground truth 


k=1..K 
salient points in the training set. K is the total number of ground truth salient points, and 


F, is the local descriptor of the k salient point. 
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We model the set of local descriptors as a mixture of N, Gaussians. The Gaussian 


mixture model is a flexible parametric distribution. When N_, is large enough, any 


m 


distribution can be modelled as a sum of N,, Gaussian functions [119]. Figure 6.7 shows 


the building blocks of the Gaussian Mixture model. 


Figure 6.7 Building blocks of the Gaussian Mixture Model 


The Gaussian Mixture model for the local descriptors is given by 


Nin 
PE acon (6.10) 
=| 


where NV : Number of Gaussian functions in the model. In our model, we have NV =4. 


J, (F) : Multivariate Gaussian density functions characterized by the mean ( JZ, ) 


and covariance matrix ( Din I 


f(F)= exp(->(F-u,)'(X, F-H,)}- GD 


] 
(27)? Oe ee 


Nie 
a, is the mixture of weights, where Dia, =1, 


n=l 
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The parameters of the Gaussian Mixture model are as follows: 
ae a 62 
OZ IG peat = LN, (6.12) 


Using the training set {F,} we estimate the parameters of the Gaussian Mixture 


kale Kow 
model (GMM) by maximizing the likelihood function. The GMM likelihood function is 
given by 


PCE /0)=TT,5 p(F, /9). (6.13) 


We use the Expectation-Maximization algorithm (EM-algorithm) to estimate the model 


parameters that maximize the likelihood function given in equation 6.13: 


0 =argmax,{p({F,},_, , /)}. re 


The E-M algorithm estimates the parameters that maximize the likelihood function by 
iterating through two main steps: Expectation and Maximization. The model parameters 


are iteratively updated in the following manner: 


E-Step: 
Using the current estimate for the model parameters, we evaluate the posterior 


th 


probabilities of the n Gaussian functions given the k “” local descriptor: 


Dit) ees iow (6.15) 
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M-Step: 


The parameters of the Gaussian model are iteratively updated in the following manner: 


peer '%s 
a, See a) (6.16) 


K 
Dinp iF,) (6.17) 


K A 
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iii. Appearance-based salient point detection 
The set 0= (ay, }, n=1..N,, represents the estimated parameters for the 


local appearance model of the ground truth salient points. This model is used to bias the 
low-level salient point detection procedure described in section 6.3.2.1. For every 
detected low-level salient point, we extract a contourlet-based local descriptor by using 
the method shown in Figure 6.6. Next, we estimate the confidence level that the detected 
salient point actually belongs to the local appearance model learnt during the training 
phase. Following the Bayes decision rule [119], we use the posterior probability as the 


measure for the confidence level. 
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6.4 Vertebrae segmentation using contourlet-based salient point 


matching and multi-scale shape prior 


This section presents the details for a new method for vertebrae segmentation in x-ray 
images. The human spine is a complex articulated structure exhibiting local and global 
deformations. In an attempt to overcome the challenges in the segmentation of vertebral 
structures, we present a novel segmentation method using shape-driven salient point 
matching in the Contourlet domain. The new method incorporates the contourlet-based 
salient point detection method and the multi-scale shape prior presented in chapter four. 
At the low level, the method uses the Non Sub-sampled Contourlet Transform (NSCT) as 
the multi-scale and multi-directional analysis tool. Using the method described in section 
6.3, we are able to extract robust and repeatable salient points related to the lumbar spine. 
Rather than using global texture models to drive the segmentation process, we construct 
contourlet-based local appearance profiles at the salient points of the vertebral bodies. 
Non sub-sampled Contourlet sub-bands offer rich localized textural and geometric 
information that help localize the object of interest during the search process. A multi- 
scale shape prior is also constructed as described in chapter four. During the testing 
phase, the Contourlet-based local appearance model is used to detect vertebral boundaries 
in the test x-ray image. The localized multi-scale shape prior is then used to drive the 
segmentation process. The new method has proven to be successful in terms of its 


robustness to low image quality and its ability to generalize to unseen cases. 


6.4.1 Methodology 
The overall model learning and segmentation framework is outlined in Figure 6.8. The 
proposed method incorporates the following: 

i. Construction of local appearance profiles by using Non-subsampled Contourlet 


descriptors. 


ii. Construction of a multi-scale shape prior by using a best-basis selection framework 


and Independent Component Analysis (ICA). 


iii. Matching prior models to unseen images by using a shape-driven salient point 


matching approach. 
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(b) 
Figure 6.8 Proposed vertebrae segmentation algorithm. (a) Model learning framework (b) 


Segmentation framework 


6.4.1.1 Construction of local appearance profiles 
Given a training set of lumbar spine x-ray images, we construct local appearance models 


for selected salient points that outline the vertebrae bodies. Rather than constructing the 
local appearance profiles for every landmark point of the shape, we build the profiles for 
only a few selected salient points. This approach allows for a robust and accurate model 


matching procedure. We refer to the selected salient points outlining the vertebral shape 
as the characteristic salient points CE, ). Figure 6.9 shows examples of x-ray images with 


the outlined contour and selected characteristic salient points (seven points per vertebra). 
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Figure 6.9 Examples of training x-ray images with outlined contours and characteristic salient points 


The following summarizes the procedure to construct the local appearance profile: 


i. Contourlet-based local texture descriptors 

- First, every image in the training set is decomposed by using the NSCT transform. We 
use a 4-level NSCT decomposition. In the first level, we use a 4-level Directional Filter 
Bank (DFB) decomposition to get 16 directional frequency partitionings. In the second 
and third levels, a 3-level DFB is used to get 8 directional frequency partitionings. In the 
final level, 2-level DFB is used with 4 directional frequency partitionings. Figure 6.10 
shows the decomposition of an x-ray image by using the NSCT transform at different 
scales and orientations. We refer to the response of the NSCT decompositions as the 


NSCT feature images. 
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Figure 6.10 NSCT image decomposition at different scales and orientations (NSCT Feature Images). 


- Second, for every characteristic salient point (P,; ), a descriptor of its local appearance 


is constructed by using the NSCT feature images. We use a square grid of NxN (N=7) 
points with the characteristic salient point at the center of the grid. A variance-based local 


descriptor is then extracted in the manner explained in section 6.3.2.2. 
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The Contourlet-based local descriptor for the characteristic salient point P, is then 
defined as 


fe) 1.) Senay OM (6.19) 


where E,, is the average absolute deviation of coefficients in a square grid for the ih 


NSCT sub-band, and D is the total number of NSCT sub-bands which is equal to 36. 


ii. Profile modeling using Gaussian Mixture model (GMM) 


For every characteristic salient point, the set of local descriptors extracted from the 


training set,{ F, } is treated as samples in R”. K is the size of the training set. A 


KaleKe 
mixture of four Gaussian distributions is then used to represent the underlying 
distribution of every set of descriptors. A sample profile for characteristic point (P,; ) can 


then be written as 


Nin 
pF, /0)=) a fF), (6.20) 
A= 


where 
N, : Number of Gaussian functions in the GMM. 


f,(F,) : Multivariate Gaussian density function. 


Nn 
a, : Mixture weights, where Die =1, 


n=l 
The parameters of the Gaussian mixture model,@={@,, oS. Ion = Ns ate 


estimated from the training set by using the EM algorithm, as explained in section 


Se Ree Ie 


By using the training set, a local appearance model is estimated for each one of the 


characteristic salient points. 


6.4.1.2 Construction of the multi-scale shape prior 


A set of training shapes are first aligned by using the Procrustes alignment method [38]. 


Let the n” shape vector with K landmarks be represented as 
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X, =i! . (6.21) 


where x and y are x-and y -coordinates. 


Next, we construct the multi-scale shape prior which can capture a wide range of 
permissible deformations in the lumbar spine. First, we use a dictionary of wavelet- 
packets to find an optimal representation for the shapes in the training set. The statistics 
of the sparse multi-scale wavelet packet representation are then modeled by using the 
Independent Component Analysis (ICA). More details regarding the construction and 


evaluation of the multi-scale shape prior were presented in chapter four. 


Any shape, X , can be described by using the multi-scale shape model as follows: 
ai eal ea col 
X= X+(Qy) [1+ Qjc,@] eo 


Q,, =arg min, ,(M (p' X)). (6.23) 
where 


‘X :Mean shape in the training set. 


@ ip : Matrix of joint best basis for optimal representation of the shapes. Dy iS 


learnt from a dictionary of wavelet-packet basis by using an entropy-based cost 


function. 


@ ic, : Matrix of statistically independent basis functions. 


@ _ : Model parameters 


6.4.1.3 Segmentation and model search procedure 

The local appearance profiles and the multi-scale shape prior are then used to drive the 
segmentation process. Figure 6.11 summarizes the model search procedure. Given an 
input test image, the segmentation of the lumbar spine is accomplished by using three 
main steps: (i) salient point detection. (ii) matching salient points to local appearance 


profiles, and (iii) using the multi-scale shape prior to drive the segmentation process. 
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Figure 6.11 Model search procedure using Contourlet-based local appearance profiles and multi-scale 
shape prior. 


The details are given in the following steps: 

Initialization: 

The search is initialized with an estimate of the approximate position of the object of 
interest in the image. Shape parameters are set to zero, which corresponds to the mean 


shape. 
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(a) Salient point detection: A multi-scale directional decomposition is achieved by 
using the Non Sub-sampled Contourlet Transform (NSCT). We use the method described 


in section 6.3 to extract a set of robust salient points. 


Figure 6.12 Contourlet-based salient point detection. (a) Input image, (b) strength image, (c) input 
image with extracted salient points. 


(b) Salient point matching: Next, we use the set of local appearance profiles learnt 
during the training phase to locate the characteristic points outling the bodies of the 
vertebrae in the test image. During the search, we use the current pose estimate of the 
lumbar spine to restrict the search space for each characteristic point. The optimal 
position for any characteristic point is defined as a salient point that has a local 


contourlet-descriptor with the maximum probability as determined by the mixture 


distribution. 


Each characteristic point rl cay has a local appearance profile modelled by using a 


Gaussian Mixture Model with the parameters 0; = {@,, LZ, Dy beat: 
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Let W, denote the extent of the local search space for Eee Leases phen Dera 


subset of the salient points that belongs to the search space (W.) of the current 


characteristic point. 
The optimal position for the characteristic point ( P.; ) is the salient point (s, ) whose 


local texture descriptor ( e, ) has the maximum probability as determined by the mixture 


distribution: 


; Nn 
Ss; =argmax,,, p(F, /8;) =arg max,, , Dinh E, Nigtes ine (6.24) 
n=l 


(c) Shape-constrained Thin Plate Spline (TPS) warping: The outcome of step (b) is a 


set of selected salient points ({ 7, }) that corresponds to the set of characteristic points 


({ M, }, i=1,...,L ) that characterizes the reference mean shape ( xX ) 


We use the two corresponding sets of points to estimate a smooth mapping 


transformation used to map the mean shape, ( X,, ), to the target object in the test image. 


By doing so, we obtain an initial segmentation of the target object. 


We find a smooth transformation, f , that satisfies the following interpolation conditions 


at the selected salient points: 


T, = f(M,),i=1...0 , (6.25) 
where 


M, : is the i” characteristic point on the reference shape. 


T, : is the corresponding selected salient point in the target image. 


L_ : total number of characteristic points used to represent the object of interest. 


The Thin Plate Spline (TPS) is a famous transformation that can represent smooth, rigid, 


and non rigid deformations [136]. Given the set of corresponding points ({ M j fb }), 
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the TPS model satisfies the interpolation condition in equation 6.25, while minimizing 


the bending energy ( am ), which is the energy required to bend a thin plate spline 


fixed at given heights that correspond to the displacements at the points: 


Bo | | (fo) + 2h,” + fy Jdxdy (6.26) 
R2 


Since the obtained corresponding point set is prone to errors, we use the approximate TPS 
model, which allows for an approximate interpolation between the points in favour of a 


smoother transformation [137]: 


L 
i 2 
Exps ¥ vie i F(M;)] J AE sending . (6.27) 
i=l 
The TPS transformation function is then given by 


Frps = arg min ; E7ps. a 


This minimization problem can be solved by using set of linear equations [136]. The TPS 


; : pans if 
transformation is characterized by two parameter vectors a=la, a, a, | and 


[ ; on 
w= Lw, WwW, caw . These vectors determine the affine and non-rigid part of the 
transformation, respectively: 
L 
f(x, y)=a,+a,xtay+ > wU(T,-(%y)) 29) 
i=l 
where 


U(r) =r’ log(r) is the basis function. 
@- Parameters for the affine component of the transformation. 


W : Parameters for the non-rigid component of the transformation. 


{ I }: set of selected salient points in the target image. 
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Next, we obtain the segmentation estimate ( x TPs ) by warping the mean shape ( x M ) 


by using the obtained TPS transformation: 


Xaps = XyatU(X,,)w. (6.30) 


Figure 6.13 illustrates the estimate of the TPS warping function by using two sets of 
selected points. The dense mean shape is then warped by using the estimated warping 


function. 


TPS Warping 


Figure 6.13 Illustration of TPS warping procedure. (a) Two sets of points used to estimate the TPS 
parameters, (b) estimated TPS warp, (c) Warped output shape 


The TPS model tries to interpolate the selected salient points while keeping the 
transformation as smooth as possible. Two main factors limit the accuracy of the 
segmentation estimate obtained by using the TPS model alone. First, the selected salient 
points that the TPS model interpolates are prone to error due to the poor image quality. 
Second, no prior information (other than the smoothness) is imposed on the TPS 


transformation. Hence, it might result in non-plausible shapes. 
Thus, we use the multi-scale statistical shape model to constrain the shape deformations 


to a space of plausible shapes. In chapter four, we illustrated the ability of the new multi- 


scale shape prior to capture both localized and global deformations. This ability would 
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enable the segmentation algorithm to generalize well in cases of normal and pathological 
vertebrae. 


We apply the multi-scale statistical shape constraints through the following steps: 


- Align the shape X,,, with the mean shape with respect to rotation and scale. 


- Project the shape X,,, onto the space represented by the multi-scale Wavelet- 


ICA basis set, and obtain the model parameters { i }. 


- Calculate the new statistically constrained shape estimate : 


AO OMS = prion a= y Guens (ony Lit Die Ol © (6.31) 


A 


- Apply the inverse affine transformation to X ms-prior to obtain the segmentation 


outcome. 
(d) Repeat steps (b-c) until convergence. 


6.4.1.4 An alternative approach: Contourlet-based livewire best-path search 

This section presents a variation of the segmentation method proposed in section 6.4.1.3. 
Instead of using the Thin Plane Spline (TPS) transformation to find the initial estimate of 
the segmented body, we propose to use a Contourlet-enhanced livewire search procedure. 
The livewire algorithm is a real-time segmentation method that finds the best paths 
between point clicks provided by the user [138]. The proposed variation integrates an 
enhanced livewire best-path search step in the segmentation framework described in sec 
6.4.1.3. This integration is performed to benefit from the power of the salient point 
matching method in locating the object of interest, and also from the object delineation 
capabilities of the livewire algorithm. Given the set of selected salient points, the livewire 
method is used to find the optimal boundary that delineates the object of interest. The 


resulting shape is then regularized by using the multi-scale shape prior. 


The livewire algorithm is a feature-based image segmentation algorithm [138]. The 
image is viewed as a directed graph where pixel vertices are the nodes, and the oriented 
pixel edges represent the edges of the graph. The problem of finding the optimal 
boundary between any two selected points is then equivalent to finding the minimum of a 
cost function between the two points [138]. The cost function is a crucial part of the 


livewire method. We would like the cost function to include rich information about the 
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boundary of the object of interest. The cost function is usually a combination of different 
feature images that are expected to provide information about the localization and 
directionality of the boundaries to be segmented. Examples of the livewire feature images 
in the literature include the Laplacian of Gaussian Zero crossing [138], gradient 
information [139], and phase information [140]. In [141], Seetharaman et al. proposed a 


method for detecting osteophytes in the lumbar spine by using livewire segmentation. 


In this section, we propose a new Contourlet-based cost function to guide the livewire 
search process. Our argument is that the inherently 2D nature of the Non Subsampled 
Contourlet Transform (NSCT) will allow for feature images with rich geometric and 
texture information. Including these feature images in the cost function will enhance the 
ability of the liverwire algorithm to localize the boundaries of the objects of interest. We 
use the NSCT sub-bands to extract useful local image features, which are then included in 
the livewire cost function. The best path search is then performed by using the new 


Contourlet-based cost function. 


Local Image Features 


- Contourlet-based edge map (E, ): The first term of the cost function is an edge 


map based upon the NSCT decomposition. This term is computed by using 
NSCT decomposition and second moment matrix analysis. This edge-map was 
introduced in section 6.3 to extract salient points. It is capable of capturing 


smooth oriented contours and corners outlining the object of interest. 


- Directional Image Feature (F,  ): Directional information is also important in the 


livewire search algorithm. Proper directional features are needed to guide the 
search algorithm to take the correct local paths. Such guidance becomes 
important in cases involving noisy images and the presence of other attractive 


neighbouring structures, particularly when the human spine is involved. 


The NSCT decomposes the image into a number of scales and directions. For 
each direction, we calculate a scale-integrated response image by first 
normalizing each scale level to its maximum value and forming an accumulated 


image by using weighted averaging: 
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s=1 
The outcome is a set of directional feature images capturing directional 


information at different orientations {E,(x, y)},@=1,....,N,. Next, we estimate 
the dominant orientation (8, ), or the one with the maximum scale-integrated 


response. The directional image feature (Es, ) is then given by 


E, =max,{E,(x, y)}. (6.33) 


- Edge detection (E,): This is obtained by performing the Canny edge detector 


by using the Contourlet-based edge map. 


Feature Combination 
Before the feature images are combined, they are converted such that a low cost is 
assigned to the desired boundary features: 


E=\1—-E/max(£). es 


The new cost function is then defined as the weighted sum of all the terms described 


above: 


Eg = OE Oi). Be (6.35) 


cost 
Best-path search 
For the second part of the algorithm, we use the Dijksta’s algorithm to find the shortest 


path between the selected salient points in the target image [142]. 


6.4.2 Comments 
Section 6.4 closes with some comments regarding the proposed vertebrae segmentation 
method. The new segmentation method combines the power of model-based 
segmentation approaches with the flexibility of low-level feature-based detection 
methods. 

- The low-level image analysis component of the proposed method is based upon 


the Non Subsampled Contourlet (NSCT) analysis of the x-ray image. As 


146 


» shenites ow trol’ b= 8 ity a Bijons ai Deane 


“noqy beau i benetdanns bono sti t0 Frise: era ht: a: bree > 


(NEO) ve 5 a fey 
\ i : ‘ ¥: . # on & "a Bi y a A mena hy ‘ ‘ ey) 


‘a 


mad ir co 


Innere aiiriiqss . < gpor pias snr We rr" oe st 


botsainl-elese miumiveat oe shtive oth onl) 160, (9 NRCS 
iy, 


vel syne Wet en ped wine ay ice sygnOryes 


i; f. a Py. ; ie tie Hh a : 
| {CO ) ale xe sit = eS re sit 
iy “QF ! } t , 


| «tee 
cya nee i Bipbitiino? aly ai a | 


( woo wol ele) dou. bsieyaos ork woly ouittines 2s pagent ee, 
AT aiiad boriesb oats ont 


@ ne 9 a ai 4 


ed 


(26.0) An his + rv obi Spo se a | 
estrode oa} hot oe ee A asenievons bigs cin pebieints agi? Wy ‘rity wees i 
SE vp Spt alt wi ainlog irailne betoota = ots nase 


AG 


a] i) tas 15 


8 on oy 7 i 


“A - ox ailt Ye se amen Jalrmasiod be oh vo 
ey ena 


discussed in chapter three, Contourlets are suited for representing higher-order 
singularities such as edges and contours. In medical imaging, Contourlet-based 
analysis provides better representation for the edges and contours of the 
anatomical structure of interest. Thus, Contourlet-based analysis enables us to 
extract robust and stable salient points for the object of interest. We also build 
efficient local descriptors for the selected salient points. These descriptors benefit 
from the rich geometrical and textural information captured by the Contourlet 


sub-bands. 


The proposed method employs a model-to-image fitting strategy based upon a 
few selected characteristic points for the object of interest. Compared to the 
Active Shape models and the related work in literature, our method has the 
advantages of increased robustness to noise. Also, the proposed segmentation 
method uses local texture appearance profiles to detect the object of interest. In 
comparison with global texture models, our model allows for more freedom in 


detecting target objects. 


The proposed segmentation method uses a new multi-scale shape prior. As 
detailed in chapter four, this prior outperforms traditional PCA-based priors. It 
has better generalization ability and is capable of capturing a wide range of 


plausible local and global deformations. 
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6.5 Content-based medical image retrieval using Contourlet-based 


salient point detector 


This section presents a new method for content-based medical image retrieval in the 
Contourlet domain. The method integrates the new salient point detector presented in 


section 6.3 in a medical image retrieval framework. 


6.5.1 Background 

Over the past decade, researchers have been increasingly interested in content-based 
medical image retrieval (CBIR) methods [143]. CBIR systems aim at retrieving images 
from a database by using visual image features rather than textual information. With the 
growing number of digital medical images in hospitals, efficient CBIR techniques can 
benefit a wide range of medical practices. However, medical image retrieval is a 
challenging task due to the quality of medical images, the complexity of anatomical 


structures, and the wide range of intra-class variability. 


Traditionally, various global and local features have been used for image retrieval 
applications. Examples of global features include color information, histograms, and 
response from directional filters such as Gabor filters. In the field of medical imaging, 
local features appear to be more appealing than global features. Moreover, using local 
features in medical image retrieval opens the door for pathological-based retrieval 
systems. In [144], Deselears et al. presented a comprehensive review of the features 
employed in image retrieval systems. In [87], Antani et al. investigated using different 
shape features of the human vertebrae for the content-based image retrieval of spine x-ray 
images. In [145], Liu and Tong proposed a method for medical image retrieval using 
Gabor-based textural features extracted from salient points. In [146], Maree et al. 
proposed a method for image retrieval using randomly selected sub-windows. In [147], 
Tommaasi et al. suggested using SIFT features for medical retrieval applications. In 
[148], Howarth suggested using Gabor-based texture features and color information for 
medical image retrieval. In [149], Ngo suggested using DCT-based features. In [150], 
Selvarani et al. proposed a method for medical image retrieval using Gabor-based low- 


level features and high-level semantic features. 
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6.5.2 Methodology 

This section presents the details for the new content-based medical image retrieval 
system. We propose a method for content-based medical image retrieval using a NSCT- 
based salient detector and local descriptors. The NSCT-based salient point detector is 
used to extract salient points that locally characterize the object of interest. We use local 
descriptors to capture local textural information from the multi-scale multi-directional 
sub-bands of the Non Subsampled Contourlet domain. Figure 6.14 summarizes the 


proposed medical image retrieval system. 


Non-subsampled Contowlet 
Decomposition 


Non-subsampled Contourlet 
Decomposition 


Figure 6.14 Content-based Image retrieval using Non-subsampled contourlet features 


The following outlines the details of the proposed method: 
- We perform multi-scale multi-directional image decomposition by using a Non- 


subsampled Contourlet Transform (NSCT). We use 3 decomposition scales with 4- , 3-, 


and 2- directional levels. This corresponds to a total of 28 sub-bands. 
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- Next, we apply the salient point detector proposed in section 6.3 to extract robust and 


stable salient points. We extract a set of N salient points, 1Sha ,t=1,2,...N , from each 


image in the database. In our experiments, N was set to 100. 


- For each salient point (SP), we extract a local descriptor from the NSCT sub-bands. 


Let V. 


,, Tepresent the local descriptor extracted from the i” salient point in the j” 


image. Then, the feature vector (F,) that represents the j image is formed by 


concatenating the extracted local descriptors from all N salient points: 


Feevaye yl (6.36) 


- The final feature vector is generated through a dimension reduction step using Principal 


Component Analysis (PCA). Using a training set of M images, we extract the feature 


vectors, { F ? J =1,2,...M }. Then, we use the feature matrix to estimate the eigenvector 
basis matrix ( @pc, ). Any NSCT-based feature vector ( F; ) can be represented by the 


projections onto the PCA-basis matrix Qpcy : 


F =F + Ocal eam 


where @ represents the contributions of various Principal Components to the variance in 
the training dataset. Dimension reduction is then achieved by retaining only the PCA 


coefficients that capture up to 90 % of the variance in the dataset. Given the PCA basis 


matrix (pc, ) and the original feature vector of a new image ( F,,,,,), the final feature 


new 


vector is obtained by projecting onto the PCA basis functions that retain only 90 % of the 


variance: 


Garey = Ore Cem = F) ; (6.38) 


- The final step is the image retrieval. Given a query image, we use the Euclidean 
distance to measure the similarity between features extracted from the query image, and 


others in the database. The obtained distances are then sorted to obtain a ranked list of 


retrieved images: 
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Where ED, is the Euclidean distance between the query image / , and the j" image in 


the database. 
6.5.3 Comments 


This section highlights some of the important aspects of the proposed content-based 


image retrieval system. 
- The proposed framework is based upon local features extracted from the Non- 
Subsampled Contourlet domain. The NSCT-based image analysis allows for the 


extraction of rich textural and geometric information. 


- Using the NSCT-based salient point detector at the core of the retrieval system allowed 


for the extraction of robust and repeatable salient points suitable for medical images. 


- Due to the locality of the features extracted, the system has the potential to allow for 


more sophisticated retrieval tasks such as pathology-related retrieval. 


ifaw 


nikoanact OV wel Giuse ‘Sestet Vishp ors movwisd somsneliins 


: =, pee) pre 
.cdetnaing? heanqorp Sabie Booed inamoort) sd To Sepo8 ehipittigil 9 , 


/ 


=) ~ § 
~Om, 


erewl <n ont Gems eet SY JIeoo naqu beaged ‘ ( ocomat’ 
velin alevigae sgai. Garnet TOA ont nbariob sabre 
Es OF Gens Dott pase 


wot iter) oletiae etelog iia.) 5 aE GET bite ration a0 bf tontt 


1 wolle on nisalog a aad pigtays on bores a eons? Hort Wo ‘ie | 
fsvaiiet bein! aihaale’ gy fous thee tevorns: batwoinei oe He 


: yn 
‘ 
f Zon 
; ri i 
fo ‘ oe 
7 
\! 
‘hy 
’ 
{ 
\ 
\i . 
’ rh, 
ea ee 
f 4 ue 
| - 


6.6 Experiments and Results 


In this section, we evaluate the performance of the methods presented in this chapter. In 


particular, experiments were conducted to evaluate the following: 


Robustness of the NSCT-based salient point detector and local descriptor. 
Accuracy of the new vertebral segmentation algorithm. 


Accuracy of the new medical image retrieval method. 


6.6.1 Evaluation of the NSCT-based salient point detector 


Datasets: 


We evaluate the performance of the proposed NSCT-based salient point detector by using 


standard test images. Figure 6.15 shows examples of the images used in the experiments. 


Figure 6.15(a) shows a standard synthetic image that exhibits different boundaries. Figure 


6.15(b) is a VanGogh image, which is a famous image with many texture variations. 


Figure 6.15(c) shows an example of the lumbar spine x-ray images. These images are 


used to quantify the performance of the new detector in terms of the following: 


Localization capabilities. 
Robustness to geometric and photographic transformations. 


Robustness to different noise levels. 


Efficiency in medical image analysis. 


Figure 6.15 Images used for evaluation of the NSCT-based salient point detector. (a) Synthetic image, 


(b) VanGogh Image, (c) X-ray image of lumbar spine 
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Evaluation Criterion: Repeatability Rate 
We evaluate the robustness of the proposed salient point detector by using the 


Repeatability Rate criterion (rr ) introduced by Schimd et al. [126]. 


Given two images related together by some transformation, the repeatability rate is the 
ratio between the number of corresponding points detected in both images to the total 
number of points in the common area. In our experiments, we consider the following 
relationships between the reference and test images: 

- Rotation transformation. 

- Thin Plate Spline transformation (as an example of non-rigid transformation). 

- Variations in illumination levels. 


- Variations in noise levels. 


Pee ex ae i 4 Sud be the two sets of interest points 
detected in the reference and test images, respectively. Let N ~ be the number of salient 


points in the common area. Let C { X Me x aN } be the number of corresponding 


pairs of salient points with a localization error of 1.5 pixels. The Repeatability rate (rr ) 


between the two images is defined by 


“ el Xo"? ae 
~~ yoo ‘ 


(@ 


rr (6.40) 


We evaluate the proposed salient point detection method (NSCT-based) against the 
following famous salient point detectors from the literature: 
1. The Harris detector (Harris). 
2. A detector proposed by Kovesi et al., based upon Log-Gabor decomposition and 
phase congruency (PC-based) [134]. 
3. A variation of the proposed salient point detector based upon steerable pyramid 


decomposition (Steerable-based). 
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6.6.1.1 Localization 

The localization capability of the new salient point detector is evaluated by using the 
synthetic test image shown in Figure 6.15(a). Figure 6.16 shows the salient points 
extracted by using various detectors. The proposed NSCT-based detector is capable of 
extracting most of the salient points in the image. The Harris detector and the Laplacian- 
of-Gaussian (LOG) detector seem to perform well only in case of strong salient regions 
(corners and edges). However, the two detectors show a limited localization capability in 
the cases involving weak edges and smooth contours. The phase congruency method (PC- 
based) demonstrates a better performance compared to that of the Harris and LOG 
detectors. The proposed NSCT-based detector is more capable of localizing the salient 
points along the smooth circular contour. This superior capability is attributed to the fact 
that the Non-subsampled Contourlet Transform (NSCT) permits a more flexible 
directional analysis of the image. Similar observations were obtained when the detectors 
were applied to a spine x-ray image. Among all the detectors investigated, the proposed 


detector seems to best detect anatomical structures with poor edges and smooth contours. 


(a) Harris detector (> ) Laplacian detector 


(c) Phase Congurency detector (4) NSCT-based detector 
(PC-based) 


Figure 6.16 Localization of various salient point detectors. (a) Harris detector, (b) Laplacian detector, 
(c) Phase Congruency detector (PC-based), and (d) NSCT-based detector. 
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(a) NSCT interest detector (b) Hamis interest detector (c) Phase Congruency interest detector 


Figure 6.17 Localization of various salient point detectors in medical images. (a) NSCT-based detector, 
(b) Harris detector, and (c) Phase Congruency detector. 


6.6.1.2 Robustness to rotation 

This section evaluates the performance of the proposed salient point detector in terms of 
its robustness to the rotation angle. Experiments were conducted by using a set of rotated 
VanGogh images. Rotations are obtained by rotating the image around its central axis. 
The image set can be obtained from [151]. The repeatability rate is calculated as a 
function of the rotating angle over the range of 10-180 degrees. Figure 6.19 shows the 
performance of the proposed interest point detector in comparison to a number of famous 
interest point detectors from the literature [126]. The Harris detector and the Cottier 
detector are based upon the auto-correlation matrix. The Heitger detector is based upon 
Gabor filters. In comparison to the other methods, the proposed NSCT-based detector has 


achieved the highest rates of repeatability over the entire range of angles. 
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Figure 6.18 VanGogh test image at various rotation angles. 
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Figure 6.19 Robustness to rotation of various salient point detectors 


6.6.1.3 Robustness to noise 

The robustness of the proposed salient point detector to noise is evaluated by using 
VanGogh test images with added Gaussian noise. Figure 6.20 shows examples of noisy 
VanGogh test images with different noise levels. Figure 6.21 shows the effect of 
variations in noise levels on the repeatability rate for the proposed NSCT-based detector 
and the Harris detector. At noise variance levels below 0.02, the new method has 


achieved an average improvement of 15.23 % over the Harris detector. The proposed 
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method demonstrates much more robustness at increased noise levels. At noise variance 
levels above 0.02, an average improvement in the repeatability rate of 70.40 % has been 


achieved. 


Figure 6.20 VanGogh test image with added Gaussian noise : ( a ) clean image, ( b ) 0.01 noise variance, 
(c ) 0.06 noise variance 
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Figure 6.21 Effect of variations in noise levels on the repeatability rate for two different salient point 
detectors 
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6.6.1.4 Robustness to change in illumination 
The robustness of the proposed method to a uniform change in illumination is evaluated 
by using the VanGogh test image with changing levels of brightness. The brightness 


levels are changed by changing the illumination factor (ff ). Figure 6.22 shows the 
VanGogh image with different brightness levels. A positive value of # increases the 
brightness level, and a negative value of B decreases the brightness level. Figure 6.23 
shows how the repeatability rate is affected by altering the brightness level of an image. 
Compared to the Harris detector, the proposed NSCT-based detector achieves an average 


improvement in repeatability rate of 48.89%. Our method achieves an improvement of 


13.124 % in the repeatability rate over the phase congruency detector (PC-based) [134]. 


Figure 6.22 VanGogh test image with various levels of brightness 
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Figure 6.23 Effect of brightness level on the repeatability rate for various salient point detectors 


6.6.1.5 Robustness to non-rigid transformations 

In medical images, anatomical structures undergo a wide range of non-rigid 
deformations. Hence, a good salient point detector for medical images needs to be robust 
to non-rigid deformations in the object of interest. Thus, we evaluate the robustness of the 
proposed salient point detector to non-rigid deformations in the lumbar spine. 
Experiments are conducted by using a set of x-ray images of the lumbar spine with 
different non-rigid deformations. The object of interest in every image pair is related by a 
non-rigid transform. The Thin Plate Spline model is used to estimate the non-rigid 
transformation. We calculate the repeatability rate as the percentage of correct matching 
with respect to the total number of common points between the two images. Table 6.1 
shows the resulting average repeatability rates obtained by using the proposed NSCT- 


based detector, Harris detector, PC-based detector, and the steerable-based detector. 


The proposed NSCT-based salient point detector achieves improvements of 64.22 % and 
18.01 % in the average repeatability rate in comparison to the Harris detector and the PC- 
based detector, respectively. The NSCT-based detector achieves an improvement of 


9.95% in comparison to the steerable-based detector. 


159 


| f yi 
a ir P 
a 
ti 
X» 
’ 3D vB) cit Tree e,, ” hes) i 
adit ile Geir? snet Dil irae a) mo Byint pagsenathys 1d Ye a 


| at. remats verve htt bad ~ senna 1a 

Live et Guilin | acyrhande Tebtinpiens ener te 
Kn od Weal 2g Le RT aS Taiskok! I hag sv ohl favo paar p 

tid APY Re yon £S nije orotate ‘ads wt enothardsb sistem 

torrente ell pe ydimandiah pit xin-sYooy oy orgerdh race, yersil 

ie tee) orl he Aaa Hwa, . ng saavbent wa atstibee ea) anor 

sat Sy oacied yao piPiearasig tie re at Taitotatant al, bigit-roct it 

hicrp-nor sul), sient At 3. itt, pe eupa oie yen Al ni? edt mvrotenand ‘bil 

dskeres. anes Wi 4B) aioe Sah a ane WMiceigongon My qustualas oW pe: 

> ali paral Owenidy nur aiveney npr doredienia Issn sd? OF eaqeet aw 

M44. betonar sii gi ited -) qo ety: iG Foe re riketjarinony: “i 9anvave gnitlvest ofl) awosla 


1ofogtsS Horeg.ulGg wel St Abi 10633 iy heen WyIIAT vneH aa hea 
; iS Ves 


bite of} S660 40 Zihomover Whe asranily i Mats dacoxp intoitas boxed jy! evi ict 
fo) bak ralaadely arrukl eas a ‘acy tvaeierro oti Shin yi iclepeaguT Sy nTIWE oll i ey 16. Bhs 
KoOMmeVeqMT tw aaveitoe Jor Sil) bege- TOC alt. wav ioagest ona bossd 
| | ae ee ' _apsaatab Beund-otctorsatte nate 21 noafmmmen a 2G | 


4 


~ 


as, eh 


OF rh ee A Yi ocean 


Figure 6.24 Two x-ray images where the object of interest is related by non-rigid deformations 


Table 6.1 Evaluation of robustness of different salient point detectors to non rigid 


deformations of the object of interest. 


Salient Point detector method Average repeatability rate 


6.6.1.6 Appearance-based salient point detector 


In section 6.3.2.2, we presented a method to incorporate prior local appearance 
knowledge of the salient points in the detection process. We evaluate the efficiency of the 
proposed method in terms of the average reduction ratio in the number of detected salient 
points. We compare the performance of the proposed NSCT-based salient point detector 
in two cases: with prior appearance knowledge and without prior appearance knowledge. 
We use both detectors to extract salient points from x-ray images of the lumbar spine. 
The targets of interest in the images are the bodies of the lumbar vertebrae. The use of the 
proposed appearance-based detector results in a significant reduction in the number of 
salient points by excluding salient points not relevant to the object of the interest. Figure 
6.25 shows examples of salient points extracted from an x-ray image by using the two 
methods. An average reduction ratio of 51.86 % is achieved by using the appearance- 


based salient point detector. 
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(a) (b) 
Figure 6.25 Example for salient point detection using NSCT-based method (a) with no prior 
appearance knowledge, (b) with prior appearance knowledge. 


6.6.2 Evaluation of contourlet-based local descriptor 

The proposed contourlet-based local descriptor appears at the core of the methods for 
vertebrae segmentation and image retrieval. The efficiency of the local descriptor greatly 
affects other higher level tasks. In this section, we evaluate the performance of the 
proposed local descriptor and assess its ability to capture localized textural and 


geometrical information. 


Evaluation criteria 
We use the “Matching score” as the evaluation criterion. Given two images that are 
related together by some transformation, the matching score is defined as the ratio 


between the number of correct matches to the total number of matches. 


Lett g ={X,}, and 6 ={X ,“} be two setsof points in the reference 
and target images, respectively. Each point (X,) is described by using the NSCT-based 
local descriptor (F,). Thus, we have two sets of local descriptors 


(AS? ={F,}., A“ ={F {9} ) that correspond to, sets of points. in the 


reference and the target images, respectively. A point in the reference image (X aa iS 
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said to match a point in the target image (X ‘ad, if the Euclidean distance between the 


two corresponding local descriptors is less than a threshold: 


(r) (t) 
|F vee 


, <Threshold . (6.42) 


Any match found is said to be correct if the two points (X, and X ;) Spatially correspond 


to each other within a localization error of 1.5 pixels. 


We compare the performance of the proposed NSCT-based local descriptor against two 
relevant descriptors from the literature: 
1. GaborJets descriptor [145]: This is a local descriptor constructed by using Gabor 
filters. 
2. Steerable-based descriptor [152,153]: A local descriptor based upon steerable 


pyramid decomposition. 


6.6.2.1 Robustness to noise 

We use the VanGogh test image with added levels of Gaussian noise to assess the 
performance of the proposed NSCT-based descriptor in the presence of noise. We 
calculate the matching scores among a number of VanGogh images that have different 
levels of noise. Table 6.2 shows the average matching score for the proposed descriptor, 
the steerable-based descriptor, and GaborJets. An average matching score of 96.05 % is 
achieved by using the proposed descriptor, versus an average matching score of 86.43 % 


and 87.74 % for the steerable-based and GaborJet descriptors, respectively. 


Table 6.2 Robustness of various local descriptors to noise 


Sterable-based 


6.6.2.2 Robustness to change of view 


We also evaluate the performance of the NSCT-based descriptor against a change in the 
viewing angle. We use a set of standard test images with varying viewing angles [151]. 
Figure 6.27 shows the matching score as a function of the change in viewing angle. On 


general, the matching score tends to decrease with the increase of the difference between 
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the viewing angles. Among the local descriptors investigated, the proposed NSCT-based 


descriptor seems to be least affected by a change in the viewing angle. 


—+— NSCT-based descriptor | 


—— Steerable filters 
——+— Gaborjet descnptor 


Mactching score % 


Image Index 


Figure 6.27 Robustness of various salient point detectors to change in viewing angle 
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6.6.3 Evaluation of segmentation algorithm 

The vertebrae segmentation experiments were performed by using a dataset of 100 x-ray 
images of a lumbar spine with outlined contours of the five lumbar vertebrae L1-L5. The 
X-ray images were provided by the National Library of Medicine [108]. Our results are 
presented in this section. The accuracy of the proposed vertebrae segmentation method is 
assessed by using a leave-40-out experiment, where 60 images are used in the training 
phase, and 40 images are used for testing. The segmentation accuracy is quantitatively 


evaluated in terms of the point-to-contour error. 


Figure 6.28 Examples of x-ray images of the lumbar spine used in evaluation of the segmentation 
algorithm 


Figure 6.29 compares the proposed segmentation method to the Active shape model 
(ASM). The first row shows examples where the ASM fails to accurately capture various 
deformations in the lumbar spine. The segmentation results in the second row reveal that 
the proposed method is able to segment a wider range of deformed vertebrae. This is 
attributed to two main reasons: (i) the effectiveness of the proposed Contourlet-based 


model-to-image fitting procedure, and (ii) the generalization ability of the multi-scale 


shape prior. 
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Figure 6.29 Segmentation results for three different images by using Active Shape model (ASM) (a,b,c), 
and the proposed segmentation method (d,e,f). 


Figure 6.30 illustrates the effect of using the new multi-scale prior in the proposed 
segmentation framework. This figure reveals that the new multi-scale shape prior is 


capable of capturing local deformations in the vertebral bodies. 
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Figure 6.30 Effect of using the proposed Multi-scale prior on segmentation accuracy (a) No prior is 
used, (b) localized multiscale prior is employed 


Table 6.3 summarizes the segmentation accuracy per vertebra for the proposed 
segmentation method. The combined average segmentation accuracy for all the lumbar 


vertebrae is 2.669 mm. 


Average pt-to-contour | Max pt-to-contour error 


error 


Table 6.3 Segmentation accuracy per vertebra for the proposed method 
1.3270 mm 2.7654 mm 
1.1510 mm 2.2063 mm 


1.2303 mm 2.2487 mm 
1.3898 mm 4.0208 mm 


— e= 


We also investigate the potential gains from using the proposed Contourlet-based 
livewire procedure to delineate the vertebral bodies. Figure 6.31 compares the 
delineation capabilities of the proposed Contourlet-based livewire method with those of 
the traditional livewire algorithm. The comparison reveais that the contourlet-based 


livewire exhibits better delineation capabilities, especially for weak edges and contours. 


166 


eras yy? 


Re ts proves. eat rinse 
a 


wt di SaSreey te, fet, eT oe ipoetongas ‘le 


ate ’ Pece Tn i - i | es st a ovr st oy : — oth 


May i rai , 
til rele aeull _ Pavey wee ae - 
i ip | YY 2 r 
rept ‘ , as vy as er 7. F | | 
- 
us f oe Y ee ea 
aqme ae) “ee corer ee snipe 
dnl a . ies au ; 7 a i J ? | ' vin 
* . a) } ’ , 
1 compat es eye apd aang eat ge ee ile abe Ge eq sranguirslly ikon renee be patie mt wet ene nf 


See ee ern ee er tions’ iid ee oe “ rei mene lee 
‘ t r ’ ; 


yaaa) Sif ,” es MN a tall ies ‘unl 


e: rx? ey 


ls 10) VOR Need eee by i ai oT diam 


opal be te _ paueyreat eehia, Sct yoy he 


eee 14 eye <a ‘ bea mae diy aia 
ra " Saray 


i 


erty iL ce 4 moe Pa Lamesstabtys teens Ne gS a \ hates 
ee | i hE ° < ; tt ii ca 
ba a , Per le | Jeg Piney ag ar ;TX thea 2 ab felon as 
hire # We f : at Ne ea i 


‘ i 


‘buadtiae 410 


pli? ULO0: | 


tp apg ttm et - ong & — - 


ie LIT. tikes A ) ho ny 


nance 9mm ey br hentai Teg een et ee Yd 
4 » | 5 , PY 


AP, 
4 


ash! “ate ANDO, de i eta ie ‘aioe {entlotr ray art) oN cisorvi heals oy mim 


ty spol uw hoi sin peta ) basen Set do? singe 


bse 


oat 


se " aii 


(c) {d) 


Ce) (f) 


Figure 6.31 Comparison for the delineation capabilities of the proposed contourlet-livewire method (b, 
d, f), and the traditional livewire algorithm (a, c, e). 


Discussion 

We presented a novel method for vertebrae segmentation using a shape-driven salient 
point matching procedure. The proposed method incorporates the following: (i) a novel 
Contourlet-based local appearance model used to detect vertebral boundaries in the test x- 
ray image, and (ii) a localized multi-scale shape prior used to drive the segmentation 


process. An average accuracy of 2.669 mm is achieved. 


Our results compare well with those of other vertebrae segmentation methods in the 
literature [36, 74, 56, 82, and 154]. In [154], Zomaro et al. proposed a hierarchal 
segmentation with an accuracy of 3mm for cervical vertebrae, and 6.4 mm for lumbar 


vertebrae. This method is fully automated since the Generalized Hough transform is used 
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to estimate the initial pose of the spine. In [56], Roberts and Cootes reported a 
segmentation accuracy of 1.28 mm by using a method based upon linking Active 
Appearance sub models. This method involves building multiple appearance models for 
segments of the spine to make up for the lack of localization in traditional appearance 
models. In [74], De Brujne et al. reported a mean point-to-contour error of 1.4 mm for 
lumbar spine segmentation using shape particle filtering. This method involves a time- 
consuming pixel classification stage. Thus, it is more suited for offline applications. In 
[82], Howe et al. reported a segmentation accuracy of 4.35 mm for lumbar spine 
segmentation using the Active Appearance model and the Generalized Hough transform 


to estimate the initial pose. 


6.6.4 Evaluation of the proposed medical image retrieval method 
This section presents our experiments to evaluate the performance of the proposed 


content-based medical image retrieval system. 


Dataset: 

We use images from the IRMA database (ImageClef2007) [125], which is publically 
available as a part of the Image Retrieval for Medical Applications (IRMA) project [125]. 
The database consists of radiograph images randomly collected from regular routines in 
the department of diagnostic Radiology, Aachen University of Technology (RWTH), 
Aechen, Germany. The images represent various views, pathologies, and anatomical 
structures. The complete database consists of 11,000 images with ground truth hierarchal 
classification. In our experiments, we use a set of 1000 images. 800 images were used in 
training, and 200 images were used as query test images. Figure 6.32 shows examples of 


the images used in our experiment. 
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Figure 6.32 Examples of the IRMA medical image database (Courtesy of TM Deserno, Dept. of 
Medical Information, RWTH, Aachen, Germany [125]) 


Evaluation Criteria 

We evaluate the performance of the new method by using the following criteria: 

- Average retrieval rate (ARR): For a query image, the retrieval rate is the ratio between 
the number of relevant images retrieved to the total number of retrieved images. The 
average retrieval rate (ARR) is calculated by averaging the retrieval rates over all the 


query images: 


Re IN ceva eerened (6.43) 
q 
retreived 
. 6.44) 
ARR = > RR,» (6. 
q=l 
where Nv vant—retreived « Number of relevant images retrieved, 
IN Eee : Total number of retrieved images, 
retreived 
RR : Retrieval rate for the g image. 


q 
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- Average Error Rate (ER): This is the average percentage of first retrieved images that 


are not relevant images. This rate is equivalent to the misclassification rate of 
neighbour classifier using the same distance measure. 


Results: 


Figure 6.33 shows examples of images of the same class where localized retriev 


methods are more efficient than global retrieval methods. Despite being from the same 


= 


) 


class, the three images exhibit a wide range of variations m quality. pose and 


deformations. 


Figure 6.33 Example of intra-class variations for images in the IRMA database 


Figure 6.34 evaluates the performance of the proposed CBIR system by using average 

retrieval rates (ARR) as a function of the number of top retmeved images. We compare 

the performance of the following systems: 

i. The proposed retrieval system using Contourlet-based salient pomt detectors 
(NSCT-local). 

ii. A retrieval system based upon Gabor-based salient detectors (Gabor-local). 

iil. A retrieval system based upon the global features extracted from the response of 


the NSCT sub-bands (NSCT-slobal). 


The proposed retrieval method (NSCT-local) tends to consistently outperform the other 
two methods. Over a range of 3-40 top retrieved images. the proposed method achieves 
an average improvement in the average retrieval rate (ARR) of 9574 %@ and 17.338 & 


over the Gabor-based system and the NSCT-global system. respectively. 
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Figure 6.34 Average retrieval rates versus number of retrieved images for the following CBIR 
methods: Gabor-based salient points (Gabor-local), NSCT-based salient points (NSCT-local), and 
NSCT-based global response (NSCT-global). 


Table 6.4 evaluates the performance of the proposed method in terms of the average 
error rate (AERR). The proposed method (NSCT-local) achieves an average error rate of 


21.43 % versus 35.71 % and 46.43 % for the Gabor-based system, and the NSCT-global 


system, respectively. 


Table 6.4 Average error rates for different CBIR methods 


Discussion 


In this section, we evaluated the performance of the new Content-based medical image 
retrieval system. The new system is based upon local features extracted from the Non- 
subsampled Contourlet domain. Using the NSCT-based salient point detector as the basis 


of our retrieval system allowed for the extraction of robust and repeatable salient points 
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suitable for medical image retrieval. The proposed retrieval method achieved an average 
retrieval rate of 65% for 20 retrieved images. This result compares well among with those 


in related work in the literature. 


In [144], Deselaers et al. evaluated the performance of various features for medical image 
retrieval. The results were reported by using the complete IRMA database. An average 
error rate of 27.7 % was reported by using SIFT-based local features. Gabor-based 
features achieved an average error rate of 37.9 %. In [145], Liu et al. used DCT-based 
salient points and Gabor features for content-based medical image retrieval. The authors 
used a database of 1000 medical images. The accuracy of the system was evaluated by 
using 50 query test images. An average retrieval rate of 54 % was achieved for 40 


retrieved images. 


In [155], Rao et al. proposed a content-based image retrieval system using the Contourlet 
transform. Global features were used for the retrieval task. Experiments were conducted 
by using a database of 750 face images. The authors reported an average retrieval rate of 
68.95 % for 15 retrieved images. Compared to the work of Rao et al. in [155], our method 
seems to be more suitable for medical applications, for the following reasons: 

- The proposed method uses the Non Subsampled Contourlet Transform (NSCT) 
for low-level image analysis. The NSCT is shift invariant, so it allows for the 
extraction of rich and discriminating information from different sub-bands. 

- The proposed method relies on salient points and local descriptors. In medical 
retrieval applications, localized features are more suitable than global features 
because of the variations in image quality and the appearance of anatomical 


structures. 
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6.7 Conclusions 


In this chapter, we presented new methods for two important medical imaging problems: 
the segmentation and retrieval of anatomical structures in digital radiographs. For image 
segmentation, we were particularly interested in the segmentation of vertebral structures 
in X-ray images. Our main focus was to develop an efficient segmentation method that 
can deal with both normal and pathological vertebrae. For image retrieval applications, 
we were interested in the retrieval of radiograph images with various normal and 


pathological anatomical structures. 


The following summarizes the contributions presented in this chapter: 


- First, at the core of the proposed segmentation and retrieval algorithms is a novel 
appearance-based salient point detection and matching method. The new method is based 
upon the Non Subsampled Contourlet Transform (NSCT), a truly 2D multi-scale and 
multi-directional image analysis tool. Contrary to the majority of salient point detectors in 
the literature, our detector integrates prior local appearance knowledge in the detection 
process. In comparison to related methods in the literature, our method has proven 
successful in detecting stable and relevant salient points in medical images. Rich and 
discriminating information was extracted by using local descriptors from the non- 


subsampled contourlet sub-bands. 


- Second, we presented a novel method for the segmentation of lumbar vertebrae by using 
shape-driven salient point matching in the Contourlet domain. The proposed method 
integrates low-level and high-level approaches. At the lower-level, the contourlet-based 
salient point detector is used to detect evidence of the object of interest in the test image. 
At the higher-levels, Contourlet-based local appearance profiles and multi-scale shape 
prior are used to drive the segmentation process. The performance of the proposed 


method was thoroughly evaluated in section 6.6. 


- Third, we presented a novel content-based medical image retrieval system using 
Contourlet-based salient point detectors. The proposed method retrieves images of similar 
classes by using salient points and Contourlet-based local descriptors. The proposed 


method was successful when applied to the IRMA database. 
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7. Simulation of Pathological Deformations 
Using Localized Modeling of Shape and 
Texture Variability 


7.1 Introduction 


This chapter presents a novel method for the sparse modeling of texture variability in 
medical images. The chapter also describes a general framework for the simulation of 
pathological deformations in x-ray images. Recently, there has been a growing interest in 
the concept of sparsity and its applications in the field of computer vision. In addition to 
the direct application of sparsity in signal compression, the role of sparsity in machine 
learning has lately emerged as a promising approach. Traditionally, texture appearance 
models are constructed by using a global PCA-based basis. A classical example is the 
Active Appearance Model (AAM) proposed by Cootes [37]. However, it is now accepted 


that most pathologies are often related to local variations in both texture and shape. 


The first part of this chapter introduces a novel sparse texture appearance model based 
upon Contourlet transform and Independent Component Analysis (CTICA-AM). A 
sparse texture representation is obtained by using the Contourlet transform. Contourlets 
are suitable for the representation of higher-order singularities in medical images because 
of the multi-scale, directionality, and anisotropy of Contourlets’ basis. Localized texture 
variations are captured by using ICA-based modeling in the compressed Contourlet 
domain. The modeling of raw pixel intensities becomes unfeasible as the resolution of the 
image increases. The proposed appearance model (CTICA-AM) benefits from the non- 
linear approximation power of Contourlets to achieve high data reduction rates, while 
preserving the accuracy of the statistical model. The new texture-enhanced appearance 
model outperforms other related wavelet-based models [62, 63]. This advantage is 


particularly important in modeling high-resolution images. 


In the second part of this chapter, we present a general framework for the simulation of 
photo-realistic spine x-ray images with various pathological deformations. Within the 
medical imaging community, simulation of images is particularly important due to the 
sacristy of ground truth data, which are needed in a number of tasks such as validating of 


segmentation and registration algorithms. Statistical models are always challenged by the 
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curse of dimensionality where the size of the available training set is far smaller than the 
dimension of the problem. The proposed synthesis framework integrates the new 
enhanced texture appearance model and the multi-scale shape model presented in chapter 


four. 


The rest of the chapter is organized as follows: Section 7.2 presents the details of the 
novel Contourlet enhanced texture appearance model. Section 7.3 describes a general 
image synthesis framework for spine x-ray images. Our experiments and results are 


presented in section 7.4. Finally, conclusions are drawn in section 7.5. 


7.2 Sparse texture appearance model using Contourlet-ICA 

This section presents a novel sparse texture appearance model based upon the Contourlet 
transform and Independent Component Analysis. The proposed model is capable of 
extracting spatially localized texture variations favouring sparsity. The model exploits 
the following concepts: (i) the ability of Contourlets to represent higher-order 
singularities, (11) the sparsifying power of Contourlets, and (iii) the locality of the ICA 
basis. Figure 7.1 summarizes the building blocks of the new Contourlet enhanced texture 


appearance model (CTICA-AM). 
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Figure 7.1 Block diagram for the Contourlet enhanced texture appearance model (CTICA- 
AM) 


7.2.1 Contourlet decomposition 
Transforms try to represent a signal as sparse as possible (Le. with few coefficients). 


Traditionally, sparsity has been exploited for compression applications. However. the role 


of sparsity in machine learning has received much inte t the past few years. The 
rationale is that statistical modeling m a sparse domaim can reduce the dimensionality of 


ng set. A number of 


the problem and achieve accurate results by using 


sforms can be used to produce a sparse representation of an image. Among various 


ae 


sforms, wavelets and wavelets packets have received much mterest over the past 
decades. However, it is now accepted that wavelets are optimal only m representing 1D 
signals. Wavelets fail to efficiently represent higher-order singularities due to wavelets’ 
inability to “see™ the smoothness along contours. Contourlets were mtroduced by Do. et 
al. as a geometrical second generation of wavelets that can better deal with higher-order 
singularities in images [99]. This advantage is attnbuted to the appealing properties of the 


Contourlet basis such as multi-scale, multi<directionality, and anisotropy. 
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We start with a training set of images, each with K corresponding landmark points 


marking the structure of interest. Let the n” shape vector be represented as: 


7.1 
fo Vee | ie cy 


where x and y are thex-and y -coordinates, respectively. 


Prior to texture modeling, we use the Generalized Procrustes method to normalize the 


shapes with respect to translation, rotation, and scaling. A reference shape (S, ) is taken 


to be the Procrustes mean shape. Next, a set of shape-free texture images is created by 


warping the convexhull of the landmarks in each training sample to the reference shape 
(S,,). This warping is done by using the piece-wise affine warping method [39]. This step 


establishes the correspondence among the images in the training set and allows for 
modelling texture variability. We use two sets of images to evaluate the proposed texture 
appearance model. The first set of images is a subset of the IMM face database created by 
the department of Informatics and Mathematical Modeling at the Technical University of 
Denmark [156]. The second set of images is a set of spine x-ray images obtained from the 
National Library of Medicine database [108]. Figure 7.2 shows examples of the face and 


spine images used in our experiments. 


Figure 7.2 (a) Face image with landmarks, (b) triangulated reference face shape, 
(c)Spine image with landmarks, (d) triangulated reference spine shape. 
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Given a training set of shape-free texture images, we perform a multi-scale multi- 
directional image decomposition by using the Contourlet transform. We use a 4-level 


Contourlet transform with 0, 4, 8, and 16 directions, respectively. 


Let {g; eer be the set of vectors of pixel intensities sampled at L locations for the 


texture images in the training set: 


£,=[£., ae a (7.2) 


{t, beg m_1is the set of vectors of concatenated Contourlet coefficients resulting from the 


decomposition of the shape-free texture images in the training set: 
t, =C,[g,]. (7.3) 


where C,,[.] is the contourlet transform operator. M is the size of the training set. 


7.2.2 Non-linear approximation 

We exploit the approximation power of Contourlets to tackle the problem of modeling 
high-resolution images. We use this power to achieve high reduction rates for the raw 
texture images prior to statistical modeling. Statistical modelling is then performed on the 


compressed Contourlet texture space. 


The non-linear approximation power of a transform is evaluated in terms of its partial 
reconstruction error. The m-term approximation of an image is obtained by 
reconstructing the transformed image by using the highest m coefficients in the 


transformed domain. According to [39], Contourlets exhibits optimal behaviour in 
: : Mee 2 
approximating smooth objects with discontinuities along the C“ curve. The mean square 


bay -1 
reconstruction error (MSE) for m-term approximation grows with 7 by using the 


; ; Ee? 
wavelet basis. For the Contourlet basis, the reconstruction error grows with log(m )m~ : 


foes 


am, (ae hae (7.4) 


? 


|t-ts 


a log(m*?)m~*, m— oo (7.5) 
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where t”", and t” are the m -term approximation using the wavelet and contourlet basis, 


respectively. 


For a training set of P images, we have a set of vectors of concatenated Contourlet 


coefficients {t, boy p- We select a subset of the Contourlet coefficients by choosing the 
m coefficients that best approximate the texture vectors, while preserving the per- 


coefficient variance over the training set. Let C denote a modified identity matrix 


reflecting the indices of the coefficients to be retained. The compressed Contourlet-based 


representation of images in the training set is then given by {(t),},i=1..P., where 


i 


(Ei = CU. oe) 


7.2.3 ICA modelling in compressed Contourlet domain 

Next, we use Independent Component Analysis (ICA) to model the texture variability in 
the compressed Contourlet domain. Pathologies are typically localized in both shape and 
texture. We propose to use ICA to extract spatially localized texture variations favouring 


sparsity. 


Independent Component Analysis (ICA) is viewed as a generalization of Principal 
Component Analysis (PCA). On the one hand, PCA can learn only second-order 
dependencies among the image pixels. On the other hand, ICA can also learn higher- 
order dependencies. In this work, we use ICA to extract localized modes of texture 
variations in the compressed Contourlet domain. We use the FastICA algorithm to find 
the independent sources of variations in the training set. The FastICA algorithm is an 
efficient method that finds independent components by maximizing non-Gaussianity in 


the data matrix [117]. 


The ICA model is defined as follows: 
NiarAS 
r= WX (7.8) 


179 


ath PK 


; fore tea i smo 


Pc eer Ty 


ahead 


hee.) Raleneimaen (> Annee Ty Ihe 's "9 
si) vaiknoone yd easitttads shag? “ls iat 
“ql aniviseig aduby aoa Vnlnw, yh oth : 


cree wie) babiboie # ane Mt oe yen NeNURCSINN 


os es OE a we Ne Db Sie war bul 


t 


ee | OEY, a 


iesihasaletonieniiee caitisielly 
1h RLLIC ee Sat oey en or (AM) ma Pius A, wrsseoqenty See: 
bofir oquivda nd. nt botitesel ellesiene! ow asi gebaelastl sine srw be 
sentra ranltge py “ihe retort ebheracage Dorey ha ma ¢ 


wye 


Init Wh tows ute inti & an tiie ab we siciank: ikioganed | 
ietiio-bligane, ima ntl! ee sored ann Belt ny A) wreinn A. 4 
“papaya minal vali np a Thea t ta ao pel a wes oe esis 
oT) to abo nA peg wh ae att Ww io eid * a9 aes 
bill (3), “rtubiiyorie ai vg pu il 
ths bmi {iaoat Aine ene i bi nt anoitahey 3) veryttine 3 
ini ein sisal Os moon > ape inky abst wet oti 
Caan at, ilvdia heat a 
| awl ek ae sated ti tte | 


‘th ay 


OO << ; ; ¥ ae! is P 
cua) ee ot ae 
Vig eh | i il Pe ie et de bl " * : nae 7 a | 
‘ : cy a) ; 
“A A PF ; i : : Ls mt 
uly =< 
oh me f” t 


where X is a matrix of observed mixed signals, and § is a matrix of statistically 
independent source signals. A and Ware the mixing and de-mixing matrices, 


respectively. Y is a matrix with the estimated source signals. 


The rows of the X matrix contain the vector representation of the texture images in the 
training set. The rows of the Y matrix contain the estimated statistically independent 


basis images. A visualization of the ICA texture appearance model is given in Figure 7.3. 


Rather than performing the ICA modeling in the pixel-intensity space, we perform the 
ICA modeling in the compressed Contourlet domain. Applying ICA to the selected 
contourlet coefficients has the following advantages. First, doing so improves the 
performance of the ICA algorithm since projecting the data matrix onto a sparse domain 
increases the independence of the sources. Second, doing so increases the robustness to 
noise and missing information by removing the noisy components through the Contourlet 


approximation step. 


Let C,[] be the linear contourlet transform; then the ICA model can be directly re- 


written in the contourlet domain as 


C,[X]= AC,[S] (7.9) 
C,[Y]=WC,[X] . (7.10) 


The Contourlet-ICA basis matrix is then defined as ®.,_;¢, =C,[Y]. C,[Y] is a matrix 


whose rows contain the statistically independent basis in the contourlet domain. 
The proposed model can be used to generate new texture images. We use the Contourlet- 


ICA model to predict selected Contourlet coefficients. We insert average coefficient 


values at the positions of all truncated coefficients. 
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Figure 7.3 Enhanced texture appearance modeling using ICA basis in Contourlet domain 


7.3 Image synthesis framework 

This section presents the details for a novel image synthesis framework to generate 
simulated spine x-ray images with various pathological deformations. The synthesis of 
medical images is important for validating segmentation and registration algorithms. Our 
goal is to create a simulation framework capable of imposing certain pathology-related 
deformations such as disc narrowing, and vertebral fractures. We focus mainly on the 
synthesis of spine x-ray images. However, the proposed framework can also be used to 


synthesize images of other anatomical structures. 


The new framework integrates the shape and texture generative models proposed earlier 


in this thesis, along with an example-based pose model. 


The following presents the details of the proposed framework: 


A grey level image I is viewed as a mapping from R? space to the range of possible 


gray levels: 
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I: R* >[0,255]. (7.11) 


tet tas a, HL =1..N beaset of N training images. J; denotes the input image. S, is a 
vector of the landmarks outlining the contour of the anatomical structure of interest. 


Let {J es S e } denote the reference image and the corresponding reference contour. 


Let A, denote the affine transformation aligning the reference shape (S,) to the 
shape S,. This transformation is found by using Procrustes Alignment [38]. 


Let W(.) denote the warping operator. The warping function is estimated by using the 


piece-wise affine warping method [39], which is efficient and allows for inverse warping. 


Animage / is characterized by the following three components: 
- The texture component, which determines the appearance of the structure of 
interest. 
- The shape component, which outlines the contour of the structure of interest. 


- The pose of the structure of interest in the image. 


Using the training set of images {I i? S; },7=1..N , we learn the variability in the pose, 
shape, and texture spaces. In each of these spaces, the image is viewed as a point whose 


coordinates constitute a parameter vector. Let @p ,@,, and @, represent the parameter 


vectors of the pose, shape, and texture, respectively. 


We learn the shape space by using the multi-scale shape model presented in chapter four. 
The generative Contourlet enhanced appearance model (CTICA-AM) is used to model 
the texture space. An example-based pose model is used to represent variations in 


translation and scaling within the input data set. 


e Multi-scale shape model 
We model variations in the shape of the anatomical structure by using the multi-shape 


model presented in chapter four. Using wavelet-packets and Independent Component 
Analysis, we extract multi-scale modes of deformations. Due to the locality of the 


Wavelet-ICA basis functions, the model can be used to selectively impose various 
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pathology-related deformations such as disc-narrowing, osteophytes, and vertebral 


fractures. 


We start with a training set of landmarks outlining the structure of interest in the images. 
Let Dyjc, be the matrix of Wavelet-ICA basis set described in chapter four. The multi- 
scale generative shape model can then be written as follows: 


S=S8 + O.Dyics. (7.12) 


A new shape (S ) is represented as a reference shape (S,) and a weighted sum of the 


wavelet-ICA multi-scale basis functions. The weighting vector (@, ) is used to impose a 


certain localized deformation. The multi-scale shape model also includes an inverse 
transformation step from the wavelet-packet domain to the spatial domain. This step is 


eliminated from equation 7.12 for clarity. 


e Texture appearance model 


We use the Contourlet enhanced texture model presented in section 7.2 to model 
localized texture variations. We start with a training set of spine x-ray images. Each 
image has a number of landmarks outlining the contour of the lumbar spine. We also add 


a number of border landmarks such that the whole image is encoded into the texture 
model. A set of normalized shape-free texture images, {7;,},i=1...N , is obtained by 


warping the images to the reference shape using the piece-wise affine warp. This 
procedure brings the pixels into correspondence. Next, we use the enhanced appearance 


model described in section 7.2 to extract a set of Contourlet-ICA basis functions 


(Denca)- 


A new texture vector (7 ) is generated by the summation of the reference texture vector 


(T, ) and the weighted sum of the Contourlet-ICA basis functions (Dore ): 
TOD (7.13) 


> 


where @., is the weighting vector used to generate a new texture image. 
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e Example-based pose model 


We generate new poses of the spine by using an example-based pose model. Given a 
training set of lumbar spine images with various poses CL i? S, },1=L.N ), we select a 


number of proto-type poses, which are used to represent the pose space. We interpolate 


these proto-type pose examples to generate a new pose of the spine. Let { See =1..M } 


be the set of proto-type poses of the reference shape obtained by affine warping the 


reference shape to the selected pose examples. S, and S, have the same pose: 


Sa AS. (7.14) 


We use interpolation techniques to synthesize a new pose within the M-1 dimensional 


pose space. A new pose for the reference shape (S, ,) 1s defined as a weighted sum of 


the selected proto-type poses: 


Smee 8. (7.15) 


Figure 7.4 shows an example of a two-dimensional pose space using four proto-type 


poses {S$ sy =1,2,3,4 }. We use bilinear interpolation to synthesize a new pose for the 


reference shape Ce ) at the (x, y ) position of the pose space: 


4 = 
ee = I) S, y (7.16) 
i=] 


where: 
a, (1) =(1—x)(1— y),a, (2) =(a)—-y), 


g.,(3) =(1—x)(y), a, (4) = (ay), and x, y € [0,1]. 
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Figure 7.4 Example of two-dimensional pose space using four proto-type poses 


Figure 7.5 shows the building blocks for the proposed image synthesis framework. An x- 


ray image with a certain imposed pathology can be synthesized by using the following 


procedure: 

1. Starting with the reference shape (S,), we use the basis of the multi-scale 
shape model to impose the desired pathological deformation. The outcome is 
a new shape (S_,) (see Equation 7.12). 

il. A new texture appearance (7° ) is then generated by using the basis set of the 
Contourlet enhanced appearance model (®_.,,., ) and the texture parameter 
vector (@,, ) (see Equation 7.13). 

iil. We use the pose parameter vector ( @,) to generate a new pose for the 
reference shape (S,,). We then find the transformation matrix (A, ) that 
maps S, toS,,. 

IV. The new shape (S,) is then warped to the new position by using the 


transformation matrix (A ): 
SARS (7.16) 


S',,p tepresents the generated shape with the new pose. 
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Via A new image is finally synthesized by rendering the synthesized shape-free 


texture to the new shape vector S, , : 


i bape emt (7.17) 


where W(.) is the operator for the warping function. 


Example-based pose model 
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Figure 7.5 Block diagram of proposed spine x-ray image synthesis framework 
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7.4 Experiments and Results 


This section presents the details for the experiments conducted to evaluate the methods 
presented in this chapter. We use two datasets of images. The first one is a dataset of 30 
X-ray images of the lumbar spine, with their corresponding ground truth segmentations 
[108]. The second one is a dataset of 40 images of faces for different individuals with 


various expressions [156]. 


7.4.1 Evaluation of the Contourlet enhanced texture appearance model 

In this section, we evaluate the performance of the proposed Contourlet enhanced texture 
appearance model. 

Approximation power 

We assess the approximation power of the proposed model for the x-ray images and the 
facial images. Our results are compared to those from a wavelet-based texture appearance 


model proposed by Stegman et al. in [62, 63]. 


For a desired reduction ratio (RR), we calculate the corresponding number of coefficients 
to keep (M). Next, we obtain an M-term approximation of the image in both the 
contourlet and wavelet domains. We compare the quality of the reconstructed signal by 


using the Signal-to-Noise Ratio (SNR). This step is repeated for various reduction ratios: 


> iC. 


PSG Lt: Sam | Sa bah aha ch (7.18) 
> Uday ly (a9) 


SNR = 10log,, 


Where /J(x, y)the original is image and J,,(x, y) is the image reconstructed using M 


coefficients in the Contourlet domain. 

Table 7.1 shows the SNR for the contourlet-based representation of the spine x-ray 
images versus the wavelet-based representation. In general, Contourlets tend to achieve a 
higher SNR, especially at higher reduction ratios. At a reduction ratio of 1:40, the 
Contourlet-based representation shows an increase of 11.9 % in the SNR. As the 
reduction ratio increases to 1:200, the contourlet-based representation shows an increase 
in the SNR of 86.3 %. This finding agrees with the results obtained from the visual 


inspection of the reconstructed images in Figure 7.6. The wavelet-based representation 
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deteriorates rapidly at higher reduction rates. At a reduction ratio of 1:200, the 


Contourlet-based representation appears to better capture the details of the image. 


Table 7.1 Reconstruction of spine x-ray images at different reduction ratios for the 


Contourlet-based and Wavelet-based appearance models. 


SNK for Contourlet- 10.66 db 12.91 db 13.04 db | 14.85 db 
5.72 db 7.41 db 10.16 db | 13.26 db 


based model 


SNR for Wavelet-based 


model 


Figure 7.6 Non-linear approximation of a spine x-ray image using Contourlets (top row), and Wavelets 
(bottom row). Reduction ratios: (a,e) 1: 200(b,f) 1:125(c,g) 1:100,(d,h) 1:40. 


Similar results were obtained when our proposed model was applied to the face images. 


results are shown in Table 7.2 and Figure 7.7. 


Table 7.2 Reconstruction of facial images at different reduction ratios for the 


Contourlet-based and Wavelet-based appearance models. 


SNR for Contourlet- | 6.87 db 8.0 db 8.86 db 11.73 db 
4.97 db | 5.31 db 5.55 db 9.96 db 


based model 


SNR for Wavelet- 


based model 
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Figure 7.7 Non linear approximation of facial image by using Contourlets (Top row), and (Wavelets). 
Reduction ratios: (a,e) 1: 200(b,f) 1:125(c,g) 1:100,(d,h) 1:40 


Locality of Contourlet-ICA basis 

Figure 7.8 compares the basis sets of the proposed appearance model (CTICA-AM) and 
the PCA-based basis set used in the Active Appearance Model (AAM). The proposed 
model produces more localized basis functions than the global PCA basis. This property 
is further exploited to impose localized texture variations. Figure 7.9 shows examples of 
facial images generated by using the new Contourlet-ICA texture appearance model. 
Thanks to the locality of the basis functions, we are able to selectively impose various 


localized deformations. 
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Figure 7.8 Examples of basis sets for different texture appearance models. (a) PCA-based Active 
Appearance Model, (b) Contourlet enhanced appearance model. 
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Figure 7.9 Examples of localized texture variations obtained by using the new sparse texture model. 


19] 


Se a Se ee ce ca Sr — = he = 


BT TAS: 
Lae 


ee 


a a ee 


‘ : % 
ine y or. ‘ud de t : 
spt A M0 ma peel rete Nigh ma Smt an te Mi ihe : rs 


delona Saphes) wadeee wrouy yf) weg pe henry ayitebwey quate Lewis 3 im , | 


Texture Reconstruction 

We study the ability of the proposed appearance model to reconstruct texture 
appearance examples by using a limited-sized training set. We conduct the leave-one- 
out experiment where by the appearance model is trained by using N-1 training 
images, and the left-over image is used for testing. This procedure is repeated for all 
the N images in the training set. The average reconstruction error (also referred to as 
the generalization error) is then calculated as the mean intensity error per pixel (for a 
256 gray level image). This procedure is repeated for N= [10 20 30 40]. Figure 7.10 
compares the performance of the proposed appearance model (CTICA-AM) to that of 
the Active Appearance Model (AAM). The new texture appearance model is shown 
to have a lower reconstruction error for training sets of different sizes. On average, 
the proposed Contourlet-ICA appearance model achieves a decrease of 17.62 % in 


the texture reconstruction error compared to the AAM model. 


average reconstruction error (per pixel) 


10 20 30 40 
Size of training set 


Figure 7.10 Reconstruction error of the new texture appearance model for different sizes of the 
training sets. 
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7.4.2 Evaluation of the image synthesis framework 

This section evaluates the performance of the complete image synthesis framework. We 
use a training set of spine x-ray images and the corresponding ground truth segmentation. 
Figure 7.11 shows examples of synthesized x-ray images where various localized 
pathology-related shape deformations were imposed. Figure 7.11(h) shows example of 
disc space narrowing between the L1 and L2 vertebrae. Figure 7.11(b,c) show signs of 
osteophyte pathology developed for the L2 vertebra. Figure 7.11(f) shows a fracture that 
was imposed on the L4 vertebra. Figure 7.12 shows examples of synthesized x-ray 
images with localized variations in texture appearance. The shape and pose parameters 


are kept constant. 


Figure 7.11 Examples of simulated x-ray spine images with various localized shape deformations being 
imposed (The texture and pose parameters are kept constant). 
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Figure 7.12 Examples of simulated x-ray spine images with various texture variations being imposed 
(The pose and shape parameters are kept constant). 


By varying the model parameters (@,,@,,@,), we can simulate photo-realistic x-ray 


images with different degrees of pose, shape and texture variations. The outcome is a set 
of synthesized images and the corresponding ground truth segmentations. Figure 7.13 
shows examples of simulated x-ray images and their ground-truth segmentations. By 
using the proposed framework, one can create an enlarged training set with specific shape 


and texture deformations. 
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Figure 7.13 Examples of simulated spine x-ray images and the corresponding segmentations. 


7.5 Conclusions 
This chapter presented new methods for the simulation of pathological deformations in x- 


ray images by using localized models of shape and texture. 


First, we presented a novel sparse texture appearance model based upon the Contourlet 
transform and Independent Component Analysis (ICA). The proposed appearance model 
benefits from the non-linear approximation power of Contourlets to achieve high data 
reduction rates, while preserving the accuracy of the statistical model. Our experimental 
results demonstrated significant performance gains for the proposed model in comparison 
to related work in the literature, particularly for high reduction rates. Acceptable results 
are obtained with reduction rates as high as 1:200. Such results are important in modeling 
high-resolution images where modeling of raw pixel intensities becomes unfeasible due 
to storage and computational requirements. The model is also capable of capturing 


localized texture variations through ICA modeling in the compressed Contourlet domain. 
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Second, we attempted to tackle the challenges related to the sacristy of ground truth data 
in medical imaging. We presented a general framework for the synthesis of photo- 
realistic spine images with various pathological deformations. The proposed framework 
integrates the shape and texture models developed in this thesis. We demonstrated the 
ability of the framework to generate x-ray images with various pathology-related shape 
and texture variations. Throughout this chapter, we focused mainly on the simulation of 
images of the human lumbar spine. However, the same framework can be used to 


simulate images of other anatomical structures. 


The new synthesis framework compares well with related methods in the literature. In 
[66, 58], the authors used image morphing for image synthesis applications. New images 
are generated as linear combinations of proto-type examples. On the other hand, the 
proposed framework generates a new image by using multi-scale statistical models of 
shape and texture. This framework has the following advantages: (i) better removal of 
linear dependencies among the proto-type examples, and (11) better extraction of complex 
variation modes. In [58, 64, 67, and 68], various PCA-based image synthesis methods 
were presented. In comparison to the PCA-based image synthesis approaches, the 
proposed framework can capture a wider range of localized deformations for the 
anatomical structure of interest. Using the new framework, we can synthesize images 
with specific pathology-related shape and texture variations. The proposed framework 


can be used to validate segmentation and registration algorithms. 


196 


nial plilotacaal4 n ng be x 
to. ott eure alt fo inka ins + Ht varerta ay 1 soba ; 
oF gage on taly wove / OnE ‘see ait evo yh 7 ‘od sai vt 


= 


er eh ; ’ : f , a Lie 

al oyuietett odin d6ediudn feapisPvitiv: how MTC froma vnetg wi 

a anit Wet AOU eos TET (ae ya grin: {iON oy nent oan zrodla st 
if, 

si baat aodio: of OQ) ctalegengale bg y) tis tG 2 tn asta teat wenn a 


lo etaborn gsierti sateen gee Bi) a Wi & @038%: sons ows t 


Sharnce 1G Mion EES we fay) Wiss porans orien 3 orion - oe 
vhonignl elaamings spon, Beam ey, Mi xdohiev 130) bie ALS Pe S27 al abo , 
aif) noibynougegls abu age apie) bayed- AO ods on moaiwcios iL beeroaiy 
set yt subikandtst howissol {i Maga fade wibiw a auatgs: ‘ey " pci 
anuctn oteodieiv, an yi “hveonadegenectt iNet sult arial) eereini w alli a 3 
oy sist yoeanom sa7 ammisuryaw ott: at ben Seqnabs boielanagetsding 

are ipo nee 41 tina earn shi biter a ses d 


8 Summary and Future Work 


8.1 Summary 


This thesis dealt with a number of challenges related to medical imaging and model- 
based computer vision. We presented novel statistical models of shape and texture to 
improve the support of various medical practices. Within a biologically-inspired 
framework, we employed these models in two spine-related tasks: the segmentation and 
quantification of vertebral structures in medical images. 


The following summarizes the contributions of this thesis. 


Multi-scale shape modelling 

In chapter four, we attempted to overcome some of the challenges associated with 
Statistical modelling in high-dimensional space. We presented a novel multi-scale 
statistical shape model using the concepts of sparsity and statistical independence to 
extract multi-scale localized modes of deformations. Within a best-basis selection 
framework, the proposed method benefits from the multi-scale and sparsifying nature of 
wavelet packets and the ability of Independent Component Analysis (ICA) to capture 
higher-order statistics. These benefits allow for the construction of a localized non-linear 
shape model with clinically relevant modes of deformation. Experiments were conducted 
to evaluate the performance of the new model in comparison to the Point Distribution 


Model (PDM) and other related methods in the literature. 


Localized vertebral shape analysis and classification 

The thesis also presented a novel framework for the analysis and classification of 
vertebral structures by using pathology-related Wavelet-ICA shape parameters. Within a 
statistical framework, we were able to establish a statistically relevant relationship 
between the Wavelet-ICA modes of deformations and clinical information. We also 
presented a novel method for multi-vertebrae classification using Wavelet-ICA 
parameters and Fisher classifier. Thanks to the locality of the features, the new 
classification scheme is able to assess the condition of several vertebrae simultaneously. 
Experiments were conducted to evaluate the performance of the proposed vertebral 
quantification methods. Our results show that the proposed framework achieves a higher 


performance than the related frameworks in the literature. 
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Segmentation and retrieval of anatomical structures in x-ray images 

We also presented a novel method for the segmentation of vertebral structures in x-ray 
images. The new method incorporates the following: (i) a novel salient point detection 
method in the Non Sub-sampled Contourlet domain, (ii) Contourlet-based local 
appearance profiles for matching salient points, and (iii) a novel multi-scale shape prior 
used to drive the segmentation process. Experiments were conducted to evaluate the 


accuracy of the new segmentation method. 


In addition to segmentation, we also used the proposed contourlet-based salient point 
detection method in a content-based image retrieval task. The proposed method benefits 
from the robustness and stability of the extracted salient points, and the rich Contourlet 
descriptors. We obtained promising results when we applied our method to the IRMA 
(Image Retrieval in Medical Applications) database [125]. This work represents the first 
contribution from the Medical group at the University of Alberta to the IRMA project. 


Enhanced modelling of texture appearance 

Finally, we tackled the challenges related to the accurate modeling of texture and sacristy 
of training data in medical imaging. We presented a new sparse texture appearance model 
based upon the Contourlet Transform and Independent Component Analysis (ICA). The 
new model benefits from the non-linear approximation power of Contourlets to achieve 
high reduction rates. The statistical independence of Independent Component Analysis 
(ICA) was employed to capture localized texture variations. We presented a general 
framework for the synthesis of photo-realistic spine images with various pathological 
deformations. The proposed framework integrates the shape and texture models 
developed in this thesis. Experiments were conducted to evaluate the performance of the 


Contourlet enhanced texture model and the image synthesis framework. 
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8.2 Future research directions 


The following discusses some of the remaining challenges and future research directions 
related to the work presented in this thesis. 


Role of sparsity in medical imaging and computer vision 

Throughout .this thesis. we tried to bring recent advances in computational harmonic 
analysis to the field of medical imaging. We were motivated by two factors: (i) our 
understanding of the Human Visual System and the role of sparsity and multi-scale 
representation in early vision stages, and (ii) recent advances im the field of harmonic 
analysis and the prospects they bring to the fields of computer vision and medical 
imaging. The sparsity of a representation has proven useful in classical signal processing 
problems such as data compression. In medical imaging and computer vision. we ate 
typically concerned with more sophisticated tasks related to image interpretation and 
inference. Hence, there is a growing demand for a rigorous investigation of the prospects 
of sparsity in various medical imaging and computer vision tasks. 


Statistical modeling of shape and texture 
- In an attempt to overcome the limitations of PCA-based models. this thesis 
focused on using Independent Component Analysis (ICA) for the statistical 
modeling of shape and texture variations of anatomical structures. In the future. 
one might investigate the performance of other non-lmear generative models. 


- In chapter seven, we described a novel Contourlet enhanced texture appearance 
model. An interesting direction for future investigation would be to use this 
model for the segmentation of structures in high-resolution images. 


- The image synthesis framework presented m chapter seven is capable of 
generating images with specific pathological deformations. A potential extension 
to this work is to use the proposed image synthesis framework to validaie 
different segmentation and registration algorithms. 


Quantification of vertebral structures 
- In chapter five, we presented a framework for vertebral analysis and 


classification using parameters of the multi-scale shape model. A future 
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extension to this framework would be to include the parameters of the localized 
texture appearance model presented in chapter seven. Pathologies tend to be 
localized in both shape and appearance. Thus, we anticipate that the integration 
of localized shape and texture parameters would enhance the overall performance 


of the classification scheme. 


Pathology at a specific vertebra is likely to affect other neighbouring vertebrae. 
Thus, a possible extension to the work presented in chapter five would be to 
include information about neighbouring vertebrae in the classification 


framework. 


Finally, the proposed quantification framework can be used to assess the 


condition of the human spine with respect to other pathologies such as fractures. 


Content-based medical image retrieval 


Over the past decade, researchers have been increasingly interested in content-based 


medical image retrieval. Most of the methods presented in this thesis can be very 


beneficial in medical image retrieval tasks. 


In chapter four, we described a new multi-scale shape model with clinically- 
relevant modes of deformations. A promising application for this model is to use 
the localized model parameters to develop an efficient pathology-related image 


retrieval system. 


In chapter six, we proposed a novel Contourlet-based salient point detection 
method for the retrieval of images with different classes. Experiments were 
conducted by using a subset of the IRMA database. In the future, we need to 
conduct a more thorough investigation of the prospects of this approach by using 
a larger database. Also, one might consider combining this approach with other 


image-indexing techniques in the literature. 


A Unified registration, segmentation, and modeling approach 


The problem of non-rigid registration is closely tied to the methods presented in this 


work. In this thesis, the main focus was the statistical modelling problem for 


segmentation and analysis applications. A possible future research direction would be to 
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work on a unified framework that integrates non-rigid registration methods with the 


modeling and segmentation methods presented in this thesis. 


Extensions to other application domains 
The methods developed throughout the thesis were evaluated mainly within the context 
of medical imaging. However, most of these methods can be applied to other domains. 

- The proposed Contourlet-based salient point detection and matching algorithm 


will be beneficial in other computer vision applications such as object tracking. 


- The multi-scale shape representation and modeling approach presented in chapter 
four can be used in advanced video processing techniques related to object-based 


predictive coding. 


- The synthesis of biometric data has recently received increasing attention as this 
application opens the door for enhanced security. The multi-scale shape and 
texture models proposed in this thesis can be used for the synthesis of artificial 


biometric data such as the face, iris, and fingerprints. 


8.3 Final Statement 


This thesis was motivated mainly by the growing demand to develop computer-assisted 
methods for medical image interpretation and diagnosis. To achieve our goals, we 
presented new methods for the multi-scale statistical modeling of shape and texture. We 
were concerned mainly with the problems of the segmentation and quantification of 
vertebral structures. We hope that the contributions presented in this thesis will help 
medical practitioners to improve the diagnosis and tracking of diseases. Also, in a wider 
sense, we hope that this work will help advance the research efforts to develop efficient 


model-based computer vision systems. 
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APPENDIX A: Aligning Shapes 


In this appendix, we describe the procedure to align a set of shapes. 


Aligning two shapes: 
Given two shapes X, and*,, we want to estimate the transformation parameters that 
best align the two shapes with respect to a weighted sum of the square distance. The 


transformation parameters are rotation @ , scale S , and translation t = (t wel oD 


The weighted mean square error is defined as 


EF =([x,-M(s,6@)(x%) —t]' W[x, = (SO) tle at) 


where : 


x aay) 
M0) }-{ ‘ } 

y a,xta,y 
a,=scos(@) and a, =ssin(@). 


The least square solution is obtained by differentiating equation A.1 with respect to each 
of the parameters. This procedure produces a set of equations that can be written in 


matrix form as follows: 


> et BA Onniea oh 
VGN, Ce aN Ban all eek, ¥ 
‘ = ; (A.2) 
PSs OO Xe re G: 
0 V ENGNG * Cc. 
where 
N-1 N-1 N-1 
X,= WX» Y,= WiYiy > Li tm Wey eas 
j=0 j=0 j=0 
N-1 N-1 N-1 
ee Wi 7 C= W(X ,%j +1j;¥2;)> C, =)w, (y, x Dey) ayia): 
k=0 j=0 j=0 


Equation A.2 is then solved by using standard matrix methods. 
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Aligning multiple shapes: 


A set of shapes is then aligned in the following iterative approach: 


1. Select a shape as an estimate of the mean shape. 
2. Align all remaining shapes to the estimated mean shape. 
3. Re-estimate the mean shape from the aligned shapes. 


4. Go to step 2 until convergence. 
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APPENDIX B: Parameters Update for ASM 
Search 


This appendix describes the procedure for updating the shape parameters in the Active 
Shape Model (ASM) image search. The shape parameters in ASM include 


The pose parameters: (scale S , rotation @ , translation t i. 


The PCA-model parameter: (weights of eigenvectors b ). 


We start with an initial shape estimation %; with the initial pose parameters (5; , 6. ,t;), 


and the initial PCA-model parameters set to zero (X,, = X+ gb, b=0 Vy 


x,=M(s,,0)[x]+t,. (B.2) 


Let dx, be the suggested movement of the current shape estimate. Then we want to 


update the model parameters so that the current shape estimate best fits the suggested 
movement in the image: 


(ds,(1+d0),dt,db) 
Jip me A N. 


I 


(B.3) 


Find new pose parameters: 
The current estimate of the pose parameters is first updated to best fit the image. This step 


is done by finding the displacements to the pose parameters ( 45, (1 + d@), dt ) by using 
the method in Appendix A: 


(ds,(1+d@),dt) 
I eee NSO 


I 


(B.4) 


(ds,(1+d@),dt) 


x; =M(s,,0)[x]+t, > x,+dx,. (B.5) 


Find new PCA-model parameters: 
Knowing the pose displacements, (ds, (1+ d@), dt) , we solve for the PCA-model 


parameters (dx, ): 


M (s(1+ds), (0+ d6))[x+dx,]+(t+dt)=x+dx (B.6) 
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M(s,(1+ds ),(6, +d6))[x+dx,,]=M(s,,6)[x]+dx, —dt . 


Since M~'(s,0)[.]=M(s',—6)[.], we have 


x+dx, =M((s,(+ds))",-(6, + d6)).[M (s,,6,)[x]+dx—dt]. 


Then 
dx, =M((s,(1+ds))"',—(6, +d@)).[M (s,,0,)[x]+dx—dt]- x 


x+ dx, = x+ddb. 
Thus, db is given by:db = @’ dx,. 


During the image search, the parameters are updated in an iterative manner as 


tu Ww dt.. 


i+] 


6. 


i+] 


> 6 +W,dé,, 
Seas Clie ds), 


b,; 


i+] 


—> b, +W,db,. 


(B.7) 


(B.8) 


(B.9) 


(B.10) 


(B.11) 


(B.12) 


(B.13) 


(B.14) 
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